Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learning



Laios, Alexandros, Katsenou, Angeliki, Tan, Yong Sheng, Johnson, Racheal, Otify, Mohamed ORCID: 0000-0002-7884-8680, Kaufmann, Angelika, Munot, Sarika, Thangavelu, Amudha, Hutson, Richard, Broadhead, Tim et al. (2021) Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learning. Cancer Control, 28. 10732748211044678.

Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learn.pdf - Published version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

<h4>Introduction</h4>Accurate prediction of patient prognosis can be especially useful for selecting the best treatment protocols. Machine Learning can serve this purpose by making predictions based on generalizable clinical patterns embedded within learning datasets. We designed a study to support feature selection for the 2-year prognostic period and compared the performance of several Machine Learning prediction algorithms for accurate 2-year prognosis estimation in patients with advanced-stage high grade serous ovarian cancer (HGSOC).

<h4>Methods</h4>Prognosis estimation was formulated as a binary classification problem. The dataset was split into training and test cohorts by repeated random sampling until there was no significant difference (p = 0.20) between the two cohorts. Ten-fold cross-validation was applied. Various state-of-the-art supervised classifiers were used. For feature selection, in addition to an exhaustive search for the best combination of features, we used the chi-square test of independence and the minimum redundancy maximum relevance (MRMR) method.

<h4>Results</h4>Two hundred nine patients were identified. The model's mean prediction accuracy reached 73%. We demonstrated that the Support Vector Machine and Ensemble Subspace Discriminant algorithms outperformed Logistic Regression in accuracy indices. The probability of achieving a cancer-free state was maximised with a combination of primary cytoreduction, good performance status and maximal surgical effort (AUC 0.63). Standard chemotherapy, performance status, tumour load and residual disease were consistently predictive of mid-term overall survival (AUC 0.63-0.66). The model recall and precision were greater than 80%.

<h4>Conclusion</h4>Machine Learning appears to be promising for accurate prognosis estimation. Appropriate feature selection is required when building an HGSOC model for 2-year prognosis prediction. We provide evidence as to what combination of prognosticators leads to the largest impact on the HGSOC 2-year prognosis.
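
The chi-square test of independence named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the feature names and the eight-patient toy data below are all hypothetical and chosen only for illustration. The sketch computes the Pearson chi-square statistic between each categorical feature and a binary outcome and ranks features by it. Because every toy feature is binary (one degree of freedom each), ranking by the statistic gives the same order as ranking by p-value.

```python
from collections import Counter


def chi_square_statistic(feature, outcome):
    """Pearson chi-square statistic for two categorical sequences of equal length."""
    if len(feature) != len(outcome):
        raise ValueError("feature and outcome must have the same length")
    n = len(feature)
    observed = Counter(zip(feature, outcome))  # contingency table cell counts
    row_totals = Counter(feature)
    col_totals = Counter(outcome)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected count if feature and outcome were independent.
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def rank_features(features, outcome):
    """Return (name, statistic) pairs sorted from most to least associated."""
    scores = {
        name: chi_square_statistic(values, outcome)
        for name, values in features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic toy data, NOT taken from the study: 1 = favourable 2-year outcome.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical binary features for the same eight synthetic patients.
features = {
    "complete_cytoreduction": [1, 1, 1, 1, 0, 0, 0, 0],
    "good_performance_status": [1, 1, 1, 0, 1, 0, 0, 0],
    "unrelated_flag": [1, 0, 1, 0, 1, 0, 1, 0],
}

ranking = rank_features(features, outcome)
for name, stat in ranking:
    print(f"{name}: chi-square = {stat:.2f}")
```

In practice the statistic would be converted to a p-value and the retained features passed to a classifier evaluated with cross-validation. Those steps are omitted here to keep the sketch free of third-party dependencies.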

Item Type: Article
Uncontrolled Keywords: ovarian cancer, cytoreduction, prognosis estimation, clinical factor analysis, predictive factors, Machine Learning
Divisions: Faculty of Health and Life Sciences
Faculty of Health and Life Sciences > Clinical Directorate
Depositing User: Symplectic Admin
Date Deposited: 08 Dec 2022 10:49
Last Modified: 18 Jan 2023 19:41
DOI: 10.1177/10732748211044678
URI: https://livrepository.liverpool.ac.uk/id/eprint/3166554