Transience in Countable MDPs



Kiefer, Stefan, Mayr, Richard, Shirmohammadi, Mahsa and Totzke, Patrick ORCID: 0000-0001-5274-8190
(2020) Transience in Countable MDPs.

2012.13739v3.pdf - Published version

Abstract

The Transience objective is not to visit any state infinitely often. While this is not possible in any finite Markov decision process (MDP), it can be satisfied in countably infinite ones, e.g., if the transition graph is acyclic. We prove the following fundamental properties of Transience in countably infinite MDPs.
1. There exist uniformly $\epsilon$-optimal MD (memoryless deterministic) strategies for Transience, even in infinitely branching MDPs.
2. Optimal strategies for Transience need not exist, even if the MDP is finitely branching. However, if an optimal strategy exists then there is also an optimal MD strategy.
3. If an MDP is universally transient (i.e., almost surely transient under all strategies) then many other objectives have a lower strategy complexity than in general MDPs. For example, $\epsilon$-optimal strategies for Safety and co-Büchi and optimal strategies for $\{0,1,2\}$-Parity (where they exist) can be chosen MD, even if the MDP is infinitely branching.
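
As a hedged illustration of the objective described in the abstract, Transience can be written as a set of runs. The notation used here (a countable state set $S$, infinite runs $\rho = s_0 s_1 s_2 \cdots$) is assumed for this sketch and does not appear in the abstract itself. The sketch also spells out why the objective cannot be met in a finite MDP but is met trivially when the transition graph is acyclic.

```latex
% Assumed notation: S is the countable state set,
% \rho = s_0 s_1 s_2 \cdots is an infinite run through the MDP.
\[
  \mathtt{Transience} \;=\;
  \bigl\{\, \rho = s_0 s_1 s_2 \cdots \;\bigm|\;
    \forall s \in S:\ \bigl|\{\, i \in \mathbb{N} : s_i = s \,\}\bigr| < \infty \,\bigr\}
\]
% Finite case: a run has infinitely many positions i but only finitely many
% states to place them in, so by the pigeonhole principle some state occurs
% infinitely often and no run is transient.
\[
  |S| < \infty \;\implies\; \mathtt{Transience} = \emptyset
\]
% Acyclic case: no state can be revisited, so each state occurs at most once
% and every run is transient, whatever strategy is used.
\[
  \text{acyclic transition graph} \;\implies\;
  \bigl|\{\, i \in \mathbb{N} : s_i = s \,\}\bigr| \le 1 \ \text{ for all } s \in S
\]
```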

Item Type: Article
Uncontrolled Keywords: math.PR, cs.FL, cs.GT, math.OC
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 02 Aug 2021 10:32
Last Modified: 18 Jan 2023 21:35
URI: https://livrepository.liverpool.ac.uk/id/eprint/3131090