General-Sum Multi-Agent Continuous Inverse Optimal Control



Neumeyer, C, Oliehoek, F ORCID: 0000-0003-4372-5055 and Gavrila, D (2021) General-Sum Multi-Agent Continuous Inverse Optimal Control. IEEE Robotics and Automation Letters, 6 (2). pp. 3429-3436.

Neumeyer21RAL.pdf - Author Accepted Manuscript

Abstract

Modelling possible future outcomes of robot-human interactions is important in the intelligent vehicle and mobile robotics domains. Knowing the reward function that explains the observed behaviour of a human agent is advantageous for modelling that behaviour with Markov Decision Processes (MDPs). However, learning the rewards that determine the observed actions from data is complicated by interactions. We present a novel inverse reinforcement learning (IRL) algorithm that can infer the reward function in multi-agent interactive scenarios. In particular, the agents may be boundedly rational (i.e., sub-optimal), a characteristic typical of human decision making. Additionally, every agent optimizes its own reward function, which makes it possible to address non-cooperative setups. In contrast to other methods, the algorithm does not rely on reinforcement learning during inference of the parameters of the reward function. We demonstrate that the proposed method accurately infers the ground truth reward function in two-agent interactive experiments.

Item Type: Article
Uncontrolled Keywords: Inverse reinforcement learning, learning from demonstration, reinforcement learning
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 17 Mar 2021 14:41
Last Modified: 18 Jan 2023 22:56
DOI: 10.1109/LRA.2021.3060411
URI: https://livrepository.liverpool.ac.uk/id/eprint/3117532