Mungojerrie: Linear-Time Objectives in Model-Free Reinforcement Learning



Hahn, Ernst Moritz ORCID: 0000-0002-9348-7684, Perez, Mateo ORCID: 0000-0003-4220-3212, Schewe, Sven ORCID: 0000-0002-9093-9518, Somenzi, Fabio ORCID: 0000-0002-2085-2003, Trivedi, Ashutosh ORCID: 0000-0001-9346-0126 and Wojtczak, Dominik ORCID: 0000-0001-5560-0546
(2023) Mungojerrie: Linear-Time Objectives in Model-Free Reinforcement Learning. In: Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science, 13993. Springer Nature Switzerland, pp. 527-545. ISBN 9783031308222


Abstract

Mungojerrie is an extensible tool that provides a framework to translate linear-time objectives into reward for reinforcement learning (RL). The tool provides convergent RL algorithms for stochastic games, reference implementations of existing reward translations for ω-regular objectives, and an internal probabilistic model checker for ω-regular objectives. This functionality is modular and operates on shared data structures, which enables fast development of new translation techniques. Mungojerrie supports finite models specified in PRISM and ω-automata specified in the HOA format, with an integrated command line interface to external linear temporal logic translators. Mungojerrie is distributed with a set of benchmarks for ω-regular objectives in RL.

Item Type: Book Section
Uncontrolled Keywords: Basic Behavioral and Social Science, Complementary and Integrative Health, Behavioral and Social Science
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 20 Jul 2023 12:44
Last Modified: 15 Mar 2024 00:24
DOI: 10.1007/978-3-031-30823-9_27
Open Access URL: https://link.springer.com/chapter/10.1007/978-3-03...
URI: https://livrepository.liverpool.ac.uk/id/eprint/3171810