Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven ORCID: 0000-0002-9093-9518, Somenzi, Fabio, Trivedi, Ashutosh and Wojtczak, Dominik ORCID: 0000-0001-5560-0546
(2022)
Reinforcement Learning with Guarantees That Hold for Ever.
Text
paper_inv-2.pdf - Author Accepted Manuscript (665kB)
Abstract
Reinforcement learning is a successful explore-and-exploit approach in which a controller tries to learn how to navigate an unknown environment. The principal approach is for an intelligent agent to learn how to maximise expected rewards. But what happens if the objective refers to non-terminating systems? We obviously cannot wait until an infinite amount of time has passed, assess the success, and update. But what can we do? This talk will tell.
Item Type: | Conference or Workshop Item (Unspecified) |
---|---|
Uncontrolled Keywords: | Behavioral and Social Science |
Divisions: | Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science |
Depositing User: | Symplectic Admin |
Date Deposited: | 05 Aug 2022 11:24 |
Last Modified: | 15 Mar 2024 00:24 |
DOI: | 10.1007/978-3-031-15008-1_1 |
URI: | https://livrepository.liverpool.ac.uk/id/eprint/3160444 |