Process Mining and Machine Learning for Intrusion Detection



Zhong, Yinzheng ORCID: 0000-0001-8477-3956
(2023) Process Mining and Machine Learning for Intrusion Detection. PhD thesis, University of Liverpool.

201057235_May2023.pdf - Author Accepted Manuscript


Abstract

With the increasing volume of internet traffic and the growing variety of internet services, the number of cyber-attacks has increased vastly in recent years. Methods used to detect and prevent cyber-attacks are called intrusion detection systems. These systems prevent damage to, or compromise of, the integrity, availability and confidentiality of infrastructures. However, the continuously increasing amount of data poses problems for current intrusion detection methods. An intrusion detection system may suffer from a lack of efficiency, an inability to work with encrypted data, and an inability to find causal relationships between a cyber-attack and concurrent internet connections. The thesis introduces a novel algorithm developed to address some of these issues. The technique takes advantage of process mining in the encoding of event data. Process mining is designed to discover a process model from an event log automatically and to analyse the generated model. The performance of process mining for intrusion detection was verified and analysed at an early stage of this research. The process mining algorithm was then modified to add online processing capabilities. The resulting algorithm is a feature generator that takes an event log as input and outputs a sequence of matrices suitable for machine learning and other processing. The performance and efficiency of the feature generator have been verified with different datasets and machine learning algorithms. The results show that every machine learning algorithm tested for classification achieves an accuracy that demonstrates the generated features can be used for intrusion detection. Verification has also been carried out on anomaly detection approaches with various unsupervised machine learning algorithms, which further illustrates that the generated features capture a higher-level abstraction of intrusion information. The generation process is efficient, and its processing speed is sufficient to handle the bandwidth encountered in practical use.
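
As a rough illustration of what "an event log in, a sequence of matrices out" can look like, the sketch below counts directly-follows relations (a common process mining encoding) per fixed-size window of events. This is an assumption made for illustration only and is not the algorithm developed in the thesis; the function name `directly_follows_matrices`, the `window_size` parameter, and the toy event labels are all hypothetical.

```python
def directly_follows_matrices(events, activities, window_size):
    """Yield one directly-follows count matrix per window of events.

    events: iterable of (case_id, activity) pairs in arrival order.
    activities: list of all activity labels; fixes the row/column order.
    window_size: number of events per window (hypothetical parameter).
    """
    index = {a: i for i, a in enumerate(activities)}
    n = len(activities)
    last_seen = {}  # case_id -> index of that case's previous activity
    matrix = [[0] * n for _ in range(n)]
    count = 0
    for case_id, activity in events:
        cur = index[activity]
        prev = last_seen.get(case_id)
        if prev is not None:
            # Within the same case, activity `prev` was directly followed by `cur`.
            matrix[prev][cur] += 1
        last_seen[case_id] = cur
        count += 1
        if count == window_size:
            yield matrix
            matrix = [[0] * n for _ in range(n)]
            count = 0
    if count:
        # Emit the final, partially filled window.
        yield matrix


# Toy event log (made-up data): two connections, with packet flags as activities.
activities = ["SYN", "ACK", "FIN", "RST"]
events = [
    ("c1", "SYN"), ("c2", "SYN"), ("c1", "ACK"),
    ("c2", "RST"), ("c1", "FIN"), ("c2", "SYN"),
]
mats = list(directly_follows_matrices(events, activities, window_size=3))
for m in mats:
    print(m)
```

Each yielded matrix can be flattened into a feature vector for a classifier or an anomaly detector. Because the generator keeps only the last activity per case, it processes events one at a time, which is one simple way to obtain online behaviour.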

Item Type: Thesis (PhD)
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 22 Aug 2023 13:14
Last Modified: 22 Aug 2023 13:15
DOI: 10.17638/03170752
Supervisors:
  • Lisitsa, Alexei
  • Goulermas, Yannis
URI: https://livrepository.liverpool.ac.uk/id/eprint/3170752