Evaluation Techniques for the Applications of Machine Learning in Cybersecurity



Alriyami, Said (2022) Evaluation Techniques for the Applications of Machine Learning in Cybersecurity. PhD thesis, University of Liverpool.

Abstract

The number of internet users is rising, and more and more parts of our lives depend on the internet. Online threats are increasing as well: millions of new malware samples are discovered every year. This makes manually detecting and defending against attacks harder, so a more automated approach is needed. Machine learning (ML) can support this automation by detecting zero-day attacks or new malware. The work presented in this thesis is concerned with the evaluation methods used to measure the performance of machine learning applications for cybersecurity. Current evaluation methods do not take into account the rapidly changing nature of security applications. We therefore need other evaluation methods that give a deeper understanding of ML models designed for cybersecurity applications. The two best-known evaluation methods are the hold-out set and k-fold cross-validation. These methods are designed to give a general understanding of model performance in order to select the model that will perform best in the deployment stage. However, different types of tasks create different challenges for these methods, and in some cases more specialised methods are needed. First, we need to check whether the results produced by a model are merely a matter of overfitting because the data were generated in the same environment, for example, overfitting to a particular computer network architecture. Second, threats to cybersecurity are constantly changing, as the constant changes in malware show. Therefore, we need to measure how much we need to update our model, how often, and what kinds of threats there are. Finally, some problems, such as cryptography problems, can generate an almost unlimited amount of data, so we need to look for different evaluation methods for them. In this work, we propose different ways of tackling these problems. For the first type, we propose the use of the Cross-Datasets Evaluation method.
We take an example from Network Intrusion Detection Systems (NIDS), with data generated from two different computer networks. For the second type, we propose five different ways of splitting the data based on duration (time) and data size. We apply this method to malware detection and propose a new deep learning model for that task. For the last type, we propose the use of a generator for the evaluation, taking the security of the RSA algorithm as an example. Together, these methods provide a better and more practical way of evaluating cybersecurity applications of machine learning.
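
As a rough illustration of how the splitting ideas mentioned in the abstract differ, the following Python sketch builds train/test index sets for k-fold cross-validation, a time-ordered split, and a cross-dataset split. The function names, the toy timestamps, and the `net_a`/`net_b` source labels are assumptions made for this example and are not taken from the thesis. In particular, the time-ordered split shown here is only one possible scheme and is not claimed to be any of the five the thesis proposes.

```python
from typing import Iterator, List, Sequence, Tuple

Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def k_fold_splits(n_samples: int, k: int) -> Iterator[Split]:
    """Yield (train, test) index lists for plain k-fold cross-validation.

    Folds are contiguous and unshuffled; every sample is tested exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def time_ordered_split(timestamps: Sequence[float], cutoff: float) -> Split:
    """Train on samples seen before `cutoff`, test on samples at or after it.

    Hypothetical scheme: it mimics deployment, where the model only ever
    sees threats that are newer than its training data.
    """
    train_idx = [i for i, t in enumerate(timestamps) if t < cutoff]
    test_idx = [i for i, t in enumerate(timestamps) if t >= cutoff]
    return train_idx, test_idx


def cross_dataset_split(sources: Sequence[str], train_source: str, test_source: str) -> Split:
    """Train on samples from one environment and test on another.

    `sources[i]` names the environment (e.g. the network) sample i came from.
    """
    train_idx = [i for i, s in enumerate(sources) if s == train_source]
    test_idx = [i for i, s in enumerate(sources) if s == test_source]
    return train_idx, test_idx


# Toy data (assumed for illustration): six samples with a year and a source network.
timestamps = [2019, 2019, 2020, 2020, 2021, 2021]
sources = ["net_a", "net_b", "net_a", "net_b", "net_a", "net_b"]

for train, test in k_fold_splits(len(timestamps), k=3):
    print("k-fold       train:", train, "test:", test)
print("time-ordered", time_ordered_split(timestamps, cutoff=2021))
print("cross-dataset", cross_dataset_split(sources, "net_a", "net_b"))
```

With k-fold, samples from every year and every network appear in both the training and the test folds, so a model can score well by memorising environment-specific or period-specific patterns. The other two splits keep the test data strictly newer, or from a different network, than the training data, which is the kind of separation the abstract argues for.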

Item Type: Thesis (PhD)
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 06 Sep 2022 07:56
Last Modified: 01 Aug 2023 01:30
DOI: 10.17638/03150759
Supervisors:
URI: https://livrepository.liverpool.ac.uk/id/eprint/3150759