Hahn, EM, Perez, M, Schewe, S (ORCID: 0000-0002-9093-9518), Somenzi, F, Trivedi, A and Wojtczak, D (ORCID: 0000-0001-5560-0546)
(2020)
Model-free reinforcement learning for stochastic parity games.
In: CONCUR 2020, 1-4 September 2020.
Model_Free_Reinforcement_Learning_for_Stochastic_Parity_Games.pdf - Author Accepted Manuscript (578 kB)
Abstract
This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena - a stochastic game graph with unknown but fixed probability distributions - to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games with a parameter ε, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as the parameter ε tends to 0. Since this reduction does not require knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies for both players in the underlying stochastic parity game. We also present a streamlined reduction from 1½-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on the experimental evaluation of both reductions.
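The abstract describes learning values of turn-based stochastic games without knowing the transition probabilities. A minimal sketch of this idea on a toy stochastic reachability game follows; the arena, state names, and hyperparameters are illustrative assumptions, not taken from the paper. Each state is owned by Max or Min; the owner's greedy choice maximizes or minimizes the Q-value, and updates back up the opponent-aware state value rather than a plain max, which is the minimax twist on standard Q-learning.

```python
import random

# Toy turn-based stochastic reachability arena (illustrative assumption):
# state -> (owner, {action: [(probability, successor), ...]}).
# Transition probabilities are only sampled, never read by the learner.
ARENA = {
    "s0": ("max", {"a": [(0.5, "s1"), (0.5, "goal")], "b": [(1.0, "sink")]}),
    "s1": ("min", {"a": [(1.0, "goal")], "b": [(1.0, "sink")]}),
}
ABSORBING = {"goal": 1.0, "sink": 0.0}  # terminal payoff: 1 iff goal reached

def sample(dist):
    """Draw a successor from a [(probability, state), ...] distribution."""
    r, acc = random.random(), 0.0
    for p, s in dist:
        acc += p
        if r < acc:
            return s
    return dist[-1][1]

def value(Q, s):
    """State value: owner picks the best action for themselves."""
    owner, actions = ARENA[s]
    vals = [Q[(s, a)] for a in actions]
    return max(vals) if owner == "max" else min(vals)

def train(episodes=20000, eps=0.2, seed=0):
    """Minimax Q-learning with eps-greedy play and 1/n step sizes."""
    random.seed(seed)
    Q = {(s, a): 0.0 for s, (_, acts) in ARENA.items() for a in acts}
    N = {k: 0 for k in Q}
    for _ in range(episodes):
        s = "s0"
        while s not in ABSORBING:
            owner, actions = ARENA[s]
            if random.random() < eps:
                a = random.choice(list(actions))
            else:
                best = max if owner == "max" else min
                a = best(actions, key=lambda x: Q[(s, x)])
            s2 = sample(actions[a])
            # Target is the terminal payoff, or the minimax value of s2.
            target = ABSORBING[s2] if s2 in ABSORBING else value(Q, s2)
            N[(s, a)] += 1
            Q[(s, a)] += (target - Q[(s, a)]) / N[(s, a)]
            s = s2
    return Q
```

In this toy arena, Min steers `s1` to the sink, so Max's only hope at `s0` is action `a`, which hits the goal with probability 0.5; the learned value of `s0` should approach 0.5 without the learner ever inspecting the transition probabilities.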
Item Type: Conference or Workshop Item (Unspecified)
Depositing User: Symplectic Admin
Date Deposited: 17 Jul 2020 10:35
Last Modified: 08 Jun 2024 13:06
DOI: 10.4230/LIPIcs.CONCUR.2020.21
URI: https://livrepository.liverpool.ac.uk/id/eprint/3094325