Scharpff, Joris, Roijers, Diederik M, Oliehoek, Frans ORCID: 0000-0003-4372-5055, Spaan, Matthijs TJ and De Weerdt, Matthijs
(2015)
Solving Multi-agent MDPs Optimally with Conditional Return Graphs.
In: The Tenth AAMAS Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM), Istanbul, Turkey.
This is the latest version of this item.
Text
Scharpff15MSDM.pdf - Unspecified Access to this file is embargoed until Unspecified. Download (594kB) |
Abstract
In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate in order find an optimal joint policy that maximises joint value. Typical solution al- gorithms exploit additive structure in the value function, but in the fully-observable multi-agent MDP setting (MMDP) such structure is not present. We propose a new optimal solver for so-called TI-MMDPs, where agents can only af- fect their local state, while their value may depend on the state of others. We decompose the returns into local returns per agent that we represent compactly in a conditional re- turn graph (CRG). Using CRGs the value of a joint policy as well as bounds on the value of partially specified joint policies can be efficiently computed. We propose CoRe, a novel branch-and-bound policy search algorithm building on CRGs. CoRe typically requires less runtime than the avail- able alternatives and is able to find solutions to problems previously considered unsolvable.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Subjects: | ?? QA75 ?? |
Depositing User: | Symplectic Admin |
Date Deposited: | 26 Nov 2015 09:20 |
Last Modified: | 17 Dec 2022 01:30 |
Related URLs: | |
URI: | https://livrepository.liverpool.ac.uk/id/eprint/2039199 |
Available Versions of this Item
-
Solving Multi-agent MDPs Optimally with Conditional Return Graphs. (deposited 20 Oct 2015 07:38)
- Solving Multi-agent MDPs Optimally with Conditional Return Graphs. (deposited 26 Nov 2015 09:20) [Currently Displayed]