Hahn, EM, Perez, Mateo, Schewe, S (ORCID: 0000-0002-9093-9518), Somenzi, Fabio, Trivedi, Ashutosh and Wojtczak, DK (ORCID: 0000-0001-5560-0546)
(2019)
Omega-Regular Objectives in Model-Free Reinforcement Learning.
In: 25th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 6–11 April 2019, Prague, Czech Republic.
Text: tacas.pdf - Author Accepted Manuscript (427kB)
Abstract
We provide the first solution for model-free reinforcement learning of ω-regular objectives for Markov decision processes (MDPs). We present a constructive reduction from the almost-sure satisfaction of ω-regular objectives to an almost-sure reachability problem, and extend this technique to learning how to control an unknown model so that the chance of satisfying the objective is maximized. We compile ω-regular properties into limit-deterministic Büchi automata instead of the traditional Rabin automata; this choice sidesteps difficulties that have marred previous proposals. Our approach allows us to apply model-free, off-the-shelf reinforcement learning algorithms to compute optimal strategies from the observations of the MDP. We present an experimental evaluation of our technique on benchmark learning problems.
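The reduction described in the abstract can be sketched in miniature. Everything below is an illustrative assumption, not the paper's construction: the toy MDP, the objective "visit state 1 infinitely often", and the parameter values are invented, and the augmentation (redirecting each accepting transition, with some probability 1 − ζ, to a rewarding sink whose reachability probability is then maximized) is a simplified reading of the ζ-based reduction, paired with ordinary tabular Q-learning as the off-the-shelf learner.

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical toy MDP (not from the paper): states 0, 1, 2 and actions
# 'a', 'b'. From 0, 'a' reaches the goal state 1 with probability 0.9,
# while 'b' falls into the absorbing trap state 2; state 1 returns to 0.
def step(state, action):
    if state == 2:
        return 2                      # trap: the goal is unreachable
    if state == 0:
        if action == 'a':
            return 1 if random.random() < 0.9 else 0
        return 2
    return 0                          # from 1, both actions go back to 0

ACTIONS = ('a', 'b')
SINK = 'sink'
ZETA = 0.9  # continuation probability on accepting transitions (assumed value)

# The objective "visit state 1 infinitely often" has a one-state
# limit-deterministic Buchi automaton whose accepting transitions are exactly
# those entering state 1, so the product MDP coincides with the MDP itself.
def augmented_step(state, action):
    """Reduction sketch: each accepting transition is redirected, with
    probability 1 - ZETA, to a rewarding sink; maximizing the probability
    of reaching the sink then tracks the chance of satisfying the objective."""
    nxt = step(state, action)
    if nxt == 1 and random.random() > ZETA:   # accepting edge taken
        return SINK, 1.0
    return nxt, 0.0

# Plain undiscounted Q-learning on the augmented (reachability) problem.
Q = defaultdict(float)
ALPHA, EPS = 0.1, 0.2
for _ in range(3000):
    s = 0
    for _ in range(200):
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r = augmented_step(s, a)
        future = 0.0 if s2 == SINK else max(Q[(s2, x)] for x in ACTIONS)
        Q[(s, a)] += ALPHA * (r + future - Q[(s, a)])
        if s2 == SINK:
            break
        s = s2

policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in (0, 1, 2)}
print(policy[0])  # 'a': the action that keeps revisiting the goal state
```

The learner never observes the transition probabilities, only sampled transitions and rewards, which is what makes the approach model-free; the greedy policy it recovers at state 0 is the one that revisits the accepting edge forever.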
| Item Type: | Conference or Workshop Item (Unspecified) |
|---|---|
| Uncontrolled Keywords: | Basic Behavioral and Social Science, Complementary and Integrative Health, Behavioral and Social Science |
| Depositing User: | Symplectic Admin |
| Date Deposited: | 20 Feb 2019 14:58 |
| Last Modified: | 15 Mar 2024 00:25 |
| DOI: | 10.1007/978-3-030-17462-0_27 |
| Open Access URL: | https://link.springer.com/chapter/10.1007%2F978-3-... |
| Related URLs: | |
| URI: | https://livrepository.liverpool.ac.uk/id/eprint/3033069 |