Missing data was handled inconsistently in UK prediction models: a review of method used



Tsvetanova, Antonia, Sperrin, Matthew, Peek, Niels, Buchan, Iain ORCID: 0000-0003-3392-1650, Hyland, Stephanie and Martin, Glen P
(2021) Missing data was handled inconsistently in UK prediction models: a review of method used. JOURNAL OF CLINICAL EPIDEMIOLOGY, 140. pp. 149-158.

CPM Review_Antonia Tsvetanova_JCEPI-S-21-00322.pdf - Author Accepted Manuscript

Abstract

Objectives: No clear guidance exists on handling missing data at each stage of developing, validating and implementing a clinical prediction model (CPM). We aimed to review the approaches to handling missing data that underlie the CPMs currently recommended for use in UK healthcare.

Study design and setting: A descriptive cross-sectional meta-epidemiological study that identified CPMs recommended by the National Institute for Health and Care Excellence (NICE) and summarized how missing data are handled across their pipelines.

Results: A total of 23 CPMs were included through "sampling strategy." Six missing data strategies were identified: complete case analysis (CCA), multiple imputation, imputation of mean values, k-nearest neighbours imputation, using an additional category for missingness, and considering missing values as risk-factor-absent. 52% of the development articles and 48% of the validation articles did not report how missing data were handled. CCA was the most common approach used for development (40%) and validation (44%). At implementation, 57% of the CPMs required complete data entry, whilst 43% allowed missing values. Three CPMs had consistent paths in their pipelines.

Conclusion: A broad variety of methods for handling missing data underlie the CPMs currently recommended for use in UK healthcare. Missing data handling strategies were generally inconsistent. Better quality assurance of CPMs needs greater clarity and consistency in handling of missing data.
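
For readers unfamiliar with the simpler strategies named in the abstract, the following is a minimal Python sketch of four of them (complete case analysis, mean imputation, an additional category for missingness, and treating missing values as risk-factor-absent). It is illustrative only and is not taken from the reviewed CPMs: the toy records, field names, and function names are all hypothetical, and multiple imputation and k-nearest neighbours imputation are omitted because they need more machinery than a short example allows.

```python
from statistics import mean

# Hypothetical toy records (not from the paper); None marks a missing value.
records = [
    {"age": 60, "sbp": 140.0, "smoker": "yes", "diabetes": True},
    {"age": 55, "sbp": None, "smoker": "no", "diabetes": None},
    {"age": 70, "sbp": 160.0, "smoker": None, "diabetes": False},
]


def complete_cases(rows):
    """Complete case analysis: keep only records with no missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def impute_mean(rows, field):
    """Mean imputation: fill a numeric field with the mean of observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = mean(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def missing_as_category(rows, field, label="missing"):
    """Additional category: recode missing values of a categorical field."""
    return [{**r, field: label if r[field] is None else r[field]} for r in rows]


def missing_as_absent(rows, field):
    """Risk-factor-absent: treat a missing binary risk factor as False."""
    return [{**r, field: False if r[field] is None else r[field]} for r in rows]


if __name__ == "__main__":
    print(len(complete_cases(records)), "complete record(s)")
    print(impute_mean(records, "sbp")[1]["sbp"])
    print(missing_as_category(records, "smoker")[2]["smoker"])
    print(missing_as_absent(records, "diabetes")[1]["diabetes"])
```

Each function returns new records rather than modifying the input, so the same data can be passed through different strategies and the results compared. This mirrors the inconsistency the abstract describes, where a CPM may handle missing values one way at development and another way at validation or implementation.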

Item Type: Article
Uncontrolled Keywords: Statistical models, Prognosis, Predictive medicine, Missing data, Imputation, Missing data handling approaches
Divisions: Faculty of Health and Life Sciences
Faculty of Health and Life Sciences > Institute of Population Health
Depositing User: Symplectic Admin
Date Deposited: 23 Feb 2022 09:43
Last Modified: 18 Jan 2023 21:11
DOI: 10.1016/j.jclinepi.2021.09.008
URI: https://livrepository.liverpool.ac.uk/id/eprint/3149482