Fabris, Fabio, Doherty, Aoife, Palmer, Daniel, de Magalhaes, Joao Pedro (ORCID: 0000-0002-6363-2465) and Freitas, Alex A (2018) A new approach for interpreting Random Forest models and its application to the biology of ageing. Bioinformatics, 34 (14). pp. 2449-2456.
Abstract
Motivation: This work uses the Random Forest (RF) classification algorithm to predict if a gene is over-expressed, under-expressed or has no change in expression with age in the brain. RFs have high predictive power, and RF models can be interpreted using a feature (variable) importance measure. However, current feature importance measures evaluate a feature as a whole (all feature values). We show that, for a popular type of biological data (Gene Ontology-based), usually only one value of a feature is particularly important for classification and the interpretation of the RF model. Hence, we propose a new algorithm for identifying the most important and most informative feature values in an RF model.

Results: The new feature importance measure identified highly relevant Gene Ontology terms for the aforementioned gene classification task, producing a feature ranking that is much more informative to biologists than an alternative, state-of-the-art feature importance measure.

Availability and implementation: The dataset and source codes used in this paper are available as 'Supplementary Material' and the description of the data can be found at: https://fabiofabris.github.io/bioinfo2018/web/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Item Type: | Article |
---|---|
Uncontrolled Keywords: | Brain, Animals, Humans, Computational Biology, Gene Expression Regulation, Aging, Software, Gene Ontology, Machine Learning |
Depositing User: | Symplectic Admin |
Date Deposited: | 02 Mar 2018 09:20 |
Last Modified: | 23 Jan 2024 13:38 |
DOI: | 10.1093/bioinformatics/bty087 |
URI: | https://livrepository.liverpool.ac.uk/id/eprint/3018526 |