SPECIFICATION-GUIDED REINFORCEMENT LEARNING
Subject
reinforcement learning
reward shaping
temporal logic
Abstract
Recent advances in Reinforcement Learning (RL) have enabled data-driven controller design for autonomous systems such as robotic arms and self-driving cars. Applying RL to such a system typically involves encoding the objective as a reward function (mapping transitions of the system to real values) and then training a neural network controller (from simulations of the system) to maximize the expected reward. However, many challenges arise when training controllers to perform complex long-horizon tasks, such as navigating a car along a complex track with multiple turns. First, it is difficult to manually define well-shaped reward functions for such tasks; it is much more natural to specify them in a high-level specification language such as Linear Temporal Logic (LTL). Second, existing algorithms for learning controllers from logical specifications do not scale well to complex tasks, for reasons that include sparse rewards and a lack of compositionality. Furthermore, existing algorithms for verifying neural network controllers (trained using RL) cannot be easily applied to controllers for complex long-horizon tasks due to large approximation errors. This thesis proposes novel techniques to overcome these challenges. We show that there are inherent limitations in obtaining theoretical guarantees for RL algorithms that learn controllers from temporal specifications. We then present compositional RL algorithms that achieve state-of-the-art performance in practice by leveraging the structure of the given logical specification. Finally, we show that compositional approaches to learning enable faster verification of learned controllers containing neural network components.
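As a purely illustrative sketch (not a formula taken from the thesis), a simple reach-avoid navigation task of the kind mentioned above could be written in LTL as

\varphi \;=\; \mathbf{F}\,\mathit{goal} \;\wedge\; \mathbf{G}\,\lnot\mathit{obstacle}

where goal and obstacle are hypothetical atomic propositions over system states, F ("eventually") requires that the goal region be reached, and G ("always") requires that obstacles be avoided throughout. A long-horizon task, such as a track with multiple turns, could then be specified by sequencing several such reach-avoid subtasks, which is the kind of structure a compositional learning algorithm can exploit.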