Reinforcement Learning with Guarantees That Hold for Ever

Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven ORCID: 0000-0002-9093-9518, Somenzi, Fabio, Trivedi, Ashutosh and Wojtczak, Dominik ORCID: 0000-0001-5560-0546 (2022) Reinforcement Learning with Guarantees That Hold for Ever.

Text
paper_inv-2.pdf - Author Accepted Manuscript

Official URL: http://dx.doi.org/10.1007/978-3-031-15008-1_1

Abstract

Reinforcement learning is a successful explore-and-exploit approach, where a controller tries to learn how to navigate an unknown environment. The principal approach is for an intelligent agent to learn how to maximise expected rewards. But what happens if the objective refers to non-terminating systems? Obviously, we cannot wait until an infinite amount of time has passed, assess the success, and update. But what can we do? This talk will tell.

Item Type:	Conference or Workshop Item (Unspecified)
Uncontrolled Keywords:	Behavioral and Social Science
Divisions:	Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User:	Symplectic Admin
Date Deposited:	05 Aug 2022 11:24
Last Modified:	15 Mar 2024 00:24
DOI:	10.1007/978-3-031-15008-1_1
URI:	https://livrepository.liverpool.ac.uk/id/eprint/3160444