Unsupervised machine learning identifies distinct ALS molecular subtypes in post-mortem motor cortex and blood expression data



Marriott, H, Kabiljo, R, Hunt, GP, Khleifat, AA, Jones, A, Troakes, C, Pfaff, AL, Quinn, JP ORCID: 0000-0003-3551-7803, Koks, S ORCID: 0000-0001-6087-6643, Dobson, RJ
et al. (2023) Unsupervised machine learning identifies distinct ALS molecular subtypes in post-mortem motor cortex and blood expression data. Acta Neuropathologica Communications, 11 (1). 208-. ISSN 2051-5960

Access the full text of this item via the Open Access link.

Abstract

Amyotrophic lateral sclerosis (ALS) displays considerable clinical and genetic heterogeneity. Machine learning approaches have previously been utilised for patient stratification in ALS as they can disentangle complex disease landscapes. However, lack of independent validation in different populations and tissue samples has greatly limited their use in clinical and research settings. We overcame these issues by performing hierarchical clustering on the 5000 most variably expressed autosomal genes from motor cortex expression data of people with sporadic ALS from the KCL BrainBank (N = 112). Three molecular phenotypes linked to ALS pathogenesis were identified: synaptic and neuropeptide signalling, oxidative stress and apoptosis, and neuroinflammation. Cluster validation was achieved by applying linear discriminant analysis models to cases from TargetALS US motor cortex (N = 93), as well as Italian (N = 15) and Dutch (N = 397) blood expression datasets, for which there was a high assignment probability (80–90%) for each molecular subtype. The ALS and motor cortex specificity of the expression signatures was tested by mapping KCL BrainBank controls (N = 59), and occipital cortex (N = 45) and cerebellum (N = 123) samples from TargetALS to each cluster, before constructing case-control and motor cortex-region logistic regression classifiers. We found that the signatures were not only able to distinguish people with ALS from controls (AUC 0.88 ± 0.10), but also reflect the motor cortex-based disease process, as there was perfect discrimination between motor cortex and the other brain regions. Cell types known to be involved in the biological processes of each molecular phenotype were found in higher proportions, reinforcing their biological interpretation. Phenotype analysis revealed distinct cluster-related outcomes in both motor cortex datasets, relating to disease onset and progression-related measures.
Our results support the hypothesis that different mechanisms underpin ALS pathogenesis in subgroups of patients and demonstrate potential for the development of personalised treatment approaches. Our method is available for the scientific and clinical community at https://alsgeclustering.er.kcl.ac.uk.
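
The first step described in the abstract (selecting the most variably expressed genes, then hierarchically clustering the samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the linkage method or distance metric, so average linkage with Euclidean distance is an assumption here, and the gene names, expression values and helper functions (`top_variable_genes`, `sample_vectors`, `hierarchical_clusters`) are hypothetical toy examples. The study itself used the 5000 most variable autosomal genes across 112 cases; the sketch uses 2 genes and 6 samples so the result can be checked by eye.

```python
from math import dist
from statistics import pvariance


def top_variable_genes(expr, k):
    """Return the k gene names with the highest variance across samples.

    expr maps a gene name to its list of per-sample expression values.
    """
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)[:k]


def sample_vectors(expr, genes):
    """Build one feature vector per sample from the selected genes."""
    n_samples = len(next(iter(expr.values())))
    return [[expr[g][i] for g in genes] for i in range(n_samples)]


def average_linkage(a, b, vectors):
    """Mean Euclidean distance between all cross-cluster sample pairs (assumed metric)."""
    return sum(dist(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))


def hierarchical_clusters(vectors, n_clusters):
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        _, p, q = min(
            (average_linkage(clusters[p], clusters[q], vectors), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical toy expression matrix: 4 genes measured in 6 samples.
expr = {
    "geneA": [0.0, 0.2, 5.0, 5.1, 10.0, 10.2],
    "geneB": [1.0, 1.1, 6.0, 6.2, 0.0, 0.1],
    "geneC": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],  # constant, carries no signal
    "geneD": [2.0, 2.1, 2.0, 2.1, 2.0, 2.1],  # near-constant
}

genes = top_variable_genes(expr, k=2)
clusters = hierarchical_clusters(sample_vectors(expr, genes), n_clusters=3)
print(genes)
print(clusters)
```

Filtering to high-variance genes before clustering removes features that cannot separate samples, which is why the constant and near-constant toy genes are dropped. The validation steps in the abstract (linear discriminant analysis on independent cohorts, logistic regression classifiers) are not covered by this sketch.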

Item Type: Article
Uncontrolled Keywords: Amyotrophic lateral sclerosis, Unsupervised and supervised machine learning, Precision medicine, Transcriptomics, Patient stratification, Biomarkers
Divisions: Faculty of Health & Life Sciences
Faculty of Health & Life Sciences > Inst. Systems, Molec & Integrative Biology > Inst. Systems, Molec & Integrative Biology
Depositing User: Symplectic Admin
Date Deposited: 02 Jan 2024 15:51
Last Modified: 28 Feb 2026 16:11
DOI: 10.1186/s40478-023-01686-8
Open Access URL: https://rdcu.be/duk4r
URI: https://livrepository.liverpool.ac.uk/id/eprint/3177634