Mungojerrie: Linear-Time Objectives in Model-Free Reinforcement Learning

Hahn, Ernst Moritz ORCID: 0000-0002-9348-7684, Perez, Mateo ORCID: 0000-0003-4220-3212, Schewe, Sven ORCID: 0000-0002-9093-9518, Somenzi, Fabio ORCID: 0000-0002-2085-2003, Trivedi, Ashutosh ORCID: 0000-0001-9346-0126 and Wojtczak, Dominik ORCID: 0000-0001-5560-0546 (2023) Mungojerrie: Linear-Time Objectives in Model-Free Reinforcement Learning. In: Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science, 13993. Springer Nature Switzerland, pp. 527-545. ISBN 9783031308222

Official URL: http://dx.doi.org/10.1007/978-3-031-30823-9_27

Abstract

Mungojerrie is an extensible tool that provides a framework to translate linear-time objectives into reward for reinforcement learning (RL). The tool provides convergent RL algorithms for stochastic games, reference implementations of existing reward translations for ω-regular objectives, and an internal probabilistic model checker for ω-regular objectives. This functionality is modular and operates on shared data structures, which enables fast development of new translation techniques. Mungojerrie supports finite models specified in PRISM and ω-automata specified in the HOA format, with an integrated command line interface to external linear temporal logic translators. Mungojerrie is distributed with a set of benchmarks for ω-regular objectives in RL.

Item Type:	Book Section
Uncontrolled Keywords:	Basic Behavioral and Social Science, Complementary and Integrative Health, Behavioral and Social Science
Divisions:	Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User:	Symplectic Admin
Date Deposited:	20 Jul 2023 12:44
Last Modified:	15 Mar 2024 00:24
DOI:	10.1007/978-3-031-30823-9_27
Open Access URL:	https://link.springer.com/chapter/10.1007/978-3-03...
URI:	https://livrepository.liverpool.ac.uk/id/eprint/3171810