An Impossibility Result in Automata-Theoretic Reinforcement Learning



Hahn, Ernst Moritz, Trivedi, Ashutosh, Perez, Mateo, Somenzi, Fabio, Schewe, Sven ORCID: 0000-0002-9093-9518 and Wojtczak, Dominik ORCID: 0000-0001-5560-0546
(2022) An Impossibility Result in Automata-Theoretic Reinforcement Learning. In: ATVA22.

paper_10.pdf - Author Accepted Manuscript

Abstract

The expanding role of reinforcement learning (RL) in safety-critical system design has promoted ω-automata as a way to express learning requirements, which are often non-Markovian, with greater ease of expression and interpretation than scalar reward signals. When ω-automata were first proposed in model-free RL, deterministic Rabin acceptance conditions were used in an attempt to provide a direct translation from ω-automata to finite-state “reward” machines defined over the same automaton structure (a memoryless reward translation). While these initial attempts to provide faithful, memoryless reward translations for Rabin acceptance conditions were unsuccessful, translations were discovered for other acceptance conditions, such as suitable limit-deterministic Büchi acceptance or, more generally, good-for-MDP Büchi acceptance conditions. Yet the question of whether a memoryless translation of Rabin conditions to scalar rewards exists remained unresolved. This paper presents an impossibility result implying that any attempt to use Rabin automata directly (without extra memory) for model-free RL is bound to fail. To establish this result, we link a class of automata enabling memoryless reward translation to closure properties of its accepting and rejecting infinity sets, and to the insight that both the property and its complement need to allow for positional strategies for such an approach to work. We believe that such impossibility results will provide foundations for the application of RL to safety-critical systems.

Item Type: Conference or Workshop Item (Unspecified)
Uncontrolled Keywords: Basic Behavioral and Social Science, Behavioral and Social Science
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 03 Aug 2022 14:55
Last Modified: 15 Mar 2024 00:24
DOI: 10.1007/978-3-031-19992-9_3
URI: https://livrepository.liverpool.ac.uk/id/eprint/3160279