Multi-Agent Learning for Security and Sustainability

Klima, R (2019) Multi-Agent Learning for Security and Sustainability. PhD thesis, University of Liverpool.

Abstract

This thesis studies the application of multi-agent learning in complex domains where safety and sustainability are crucial. We target some of the main obstacles to deploying multi-agent learning techniques in such domains: modelling complex environments with multi-agent interaction, designing robust learning processes, and modelling adversarial agents. The main goal of using modern multi-agent learning methods is to improve the effectiveness of behaviour in such domains, and hence increase sustainability and security. This thesis investigates three complex real-world domains: space debris removal, critical domains with risky states, and spatial security domains such as illegal rhino poaching.

We first tackle the challenge of modelling a complex multi-agent environment. The focus is on the problem of removing space debris, which poses a major threat to the sustainability of Earth orbit. We develop a high-fidelity space debris simulator that allows us to simulate the future evolution of the space debris environment. Using data from the simulator, we propose a surrogate model that enables fast evaluation of different strategies chosen by the space actors. We then analyse the dynamics of strategic decision making among multiple space actors, comparing different models of agent interaction: static vs. dynamic and centralised vs. decentralised. The outcome of our work can help future decision makers to design debris removal strategies, and consequently mitigate the threat of space debris.

Next, we study how to design a robust learning process in critical domains with risky states, where destabilisation of local components can severely impact the whole network. We propose a novel robust operator κ that can be combined with reinforcement learning methods to learn safe policies, mitigating the threat of external attack or failure in the system.

Finally, we investigate the challenge of learning an effective behaviour while facing adversarial attackers in spatial security domains such as illegal rhino poaching. We assume that such attackers can occasionally be observed. Our approach combines Bayesian inference with temporal difference learning in order to build a model of the attacker’s behaviour. Our method makes effective use of partial observations of the attacker’s location and approximates the performance of the full-observability case.

This thesis therefore presents novel methods and tackles several important obstacles to deploying multi-agent learning algorithms in the real world, further narrowing the reality gap between theoretical models and real-world applications.

Item Type:	Thesis (PhD)
Divisions:	Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User:	Symplectic Admin
Date Deposited:	23 Oct 2019 14:56
Last Modified:	19 Jan 2023 00:22
DOI:	10.17638/03058825
Supervisors:	Tuyls, Karl; Savani, Rahul (ORCID: 0000-0003-1262-7831)
URI:	https://livrepository.liverpool.ac.uk/id/eprint/3058825