Bayesian Inference for Supervised Machine Learning: Algorithms and Applications



Byrnes, Paul
(2019) Bayesian Inference for Supervised Machine Learning: Algorithms and Applications. PhD thesis, University of Liverpool.

Abstract

Advances in computing, together with growing curiosity about the use of data in the world around us, have made machine learning an area of much interest in recent decades. Its capabilities in automating processes such as face recognition at airport security or self-driving vehicles have highlighted the potential positive influence it could have on society. Behind many of these processes are statistical models which identify patterns in data sets to support a decision-making process. However, such models require the computation of unknown parameters which directly impact their predictive capabilities. This dissertation explores the development and application of Bayesian inference frameworks suitable for parameter identification in supervised machine learning methods. A recent analogy has opened up the possibility of interpreting Bayesian inference as rare event simulation. Bayesian Updating with Structural reliability methods (BUS) exploits the low acceptance rate in rejection-based sampling, allowing techniques from reliability analysis to solve the Bayesian updating task. A key issue for the BUS framework, in terms of sample quality and sampler efficiency, is when to terminate the simulation. Currently, this is decided by a computationally expensive automatic stopping condition. To improve computational efficiency, two new stopping criteria are introduced. Aside from this reduction in cost, the proposed approaches not only simplify the implementation of the framework for the practitioner in terms of coding and theoretical understanding but also offer statistical guarantees of sampling from the correct distribution.
With the emergence of large data sets has come the need for scalable algorithms which offer efficient solutions. To improve the suitability of BUS for such tasks, Support Vector Machines (SVM) are integrated into the BUS approach to reduce the total number of model evaluations when the number of data observations is large. Additionally, the capabilities of the methods developed in this dissertation are illustrated on two real-life breast cancer classification tasks: the first concerns the identification of cancerous tissue in biopsy samples, and the second the identification of relapse rates from patient molecular data. Aside from the suitability of the Bayesian inference frameworks for such problems, the potential of supervised machine learning to improve the diagnosis process for cancer patients is also discussed.
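
For readers unfamiliar with the rejection-sampling view of Bayesian updating that the abstract refers to, the following is a minimal sketch and not the thesis's implementation. It assumes a toy one-dimensional problem chosen purely for illustration: a standard normal prior and a single observation y = 1 with Gaussian noise of standard deviation 0.5. A prior draw theta is accepted when u <= c * L(theta), where u is uniform on (0, 1] and the constant c satisfies c * L(theta) <= 1 everywhere. The function name `bus_rejection_sampler` and all of its parameters are hypothetical. When the acceptance event is rare, the BUS approach estimates it with reliability methods rather than the brute-force loop shown here; the stopping criteria and the SVM extension described in the abstract are not shown.

```python
import numpy as np


def bus_rejection_sampler(log_likelihood, sample_prior, log_c, n_samples, rng):
    """Plain rejection sampling in the BUS formulation (illustrative only).

    Accept a prior draw theta when u <= c * L(theta), with u ~ U(0, 1].
    Validity requires c * L(theta) <= 1 for every theta.
    """
    accepted = []
    n_trials = 0
    while len(accepted) < n_samples:
        theta = sample_prior(rng)
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        n_trials += 1
        # Comparison done in log space for numerical stability.
        if np.log(u) <= log_c + log_likelihood(theta):
            accepted.append(theta)
    return np.array(accepted), len(accepted) / n_trials


# Assumed toy problem: prior N(0, 1), one observation y with noise sd sigma.
y, sigma = 1.0, 0.5


def log_likelihood(theta):
    # Unnormalised Gaussian log-likelihood; its maximum is 0 at theta = y.
    return -0.5 * ((y - theta) / sigma) ** 2


def sample_prior(rng):
    return rng.standard_normal()


rng = np.random.default_rng(0)
# Since max L = 1 for the unnormalised likelihood, c = 1 (log_c = 0) is valid.
samples, acceptance_rate = bus_rejection_sampler(
    log_likelihood, sample_prior, log_c=0.0, n_samples=5000, rng=rng
)

# For this conjugate setup the exact posterior is N(0.8, 0.2),
# which gives a reference point for the sample statistics.
print(samples.mean(), samples.var(), acceptance_rate)
```

In this toy case the acceptance rate is moderate, so brute-force rejection is affordable. With many observations the likelihood becomes sharply peaked and the acceptance event becomes rare, which is the regime where the rare-event interpretation is useful.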

Item Type: Thesis (PhD)
Uncontrolled Keywords: Bayesian Inference, Markov Chain Monte Carlo, Machine Learning, Cancer Diagnosis
Divisions: Faculty of Science and Engineering > School of Engineering
Depositing User: Symplectic Admin
Date Deposited: 08 Jan 2020 09:31
Last Modified: 01 Jan 2024 02:30
DOI: 10.17638/03061927
Supervisors:
  • Diaz De La O, Francisco
URI: https://livrepository.liverpool.ac.uk/id/eprint/3061927