Developing prediction models to estimate the risk of two survival outcomes both occurring: A comparison of techniques



Pate, Alexander, Sperrin, Matthew, Riley, Richard D, Sergeant, Jamie C, Van Staa, Tjeerd, Peek, Niels, Mamas, Mamas A, Lip, Gregory YH ORCID: 0000-0002-7566-1626, O'Flaherty, Martin ORCID: 0000-0001-8944-4131, Buchan, Iain ORCID: 0000-0003-3392-1650
et al. (2023) Developing prediction models to estimate the risk of two survival outcomes both occurring: A comparison of techniques. STATISTICS IN MEDICINE, 42 (18). pp. 3184-3207.

Manuscript - Developing multivariate prediction models to predict the joint risk 20230201 R1.docx - Author Accepted Manuscript

Abstract

<h4>Introduction</h4>This study considers the prediction of the time until two survival outcomes have both occurred. We compared a variety of analytical methods, motivated by a typical clinical problem of multimorbidity prognosis.
<h4>Methods</h4>We considered five methods: product (multiply marginal risks), dual-outcome (directly model the time until both events occur), multistate models (msm), and a range of copula and frailty models. We assessed calibration and discrimination under a variety of simulated data scenarios, varying outcome prevalence and the amount of residual correlation. The simulation focused on model misspecification and statistical power. Using data from the Clinical Practice Research Datalink, we compared model performance when predicting the risk of cardiovascular disease and type 2 diabetes both occurring.
<h4>Results</h4>Discrimination was similar for all methods. The product method was poorly calibrated in the presence of residual correlation. The msm and dual-outcome models were the most robust to model misspecification but suffered a drop in performance at small sample sizes due to overfitting, to which the copula and frailty models were less susceptible. The performance of the copula and frailty models was highly dependent on the underlying data structure. In the clinical example, the product method was poorly calibrated when adjusting for 8 major cardiovascular risk factors.
<h4>Discussion</h4>We recommend the dual-outcome method for predicting the risk of two survival outcomes both occurring. It was the most robust to model misspecification, although it was also the most prone to overfitting. The clinical example motivates the use of the methods considered in this study.
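The product method's sensitivity to residual correlation can be illustrated with a small simulation. The following is a minimal sketch and not the authors' simulation design: it assumes two exponential event times linked by a shared gamma frailty, no covariates (so all correlation is residual), and hypothetical names and parameter values (`simulate_risks`, `rate1`, `rate2`, `horizon`). It compares the product of the two marginal risks with the observed proportion in whom both events occur by the horizon.

```python
import numpy as np


def simulate_risks(frailty_var, n=200_000, horizon=10.0,
                   rate1=0.05, rate2=0.03, seed=0):
    """Compare the product of marginal risks with the observed joint risk.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    if frailty_var > 0:
        # Shared gamma frailty with mean 1 and variance frailty_var;
        # it induces positive correlation between the two event times.
        z = rng.gamma(shape=1.0 / frailty_var, scale=frailty_var, size=n)
    else:
        # No frailty: the two event times are independent.
        z = np.ones(n)

    # Exponential event times whose hazards are scaled by the shared frailty.
    t1 = rng.exponential(scale=1.0 / (rate1 * z))
    t2 = rng.exponential(scale=1.0 / (rate2 * z))

    risk1 = np.mean(t1 <= horizon)  # marginal risk of outcome 1
    risk2 = np.mean(t2 <= horizon)  # marginal risk of outcome 2
    joint = np.mean((t1 <= horizon) & (t2 <= horizon))  # both by horizon
    return {"product": risk1 * risk2, "joint": joint}


if __name__ == "__main__":
    for var in (0.0, 1.0):
        print(var, simulate_risks(var))
```

Under these assumptions the product and joint risks should roughly agree when the frailty variance is zero, and the product should underestimate the joint risk once the frailty variance is positive.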

Item Type: Article
Uncontrolled Keywords: clinical prediction model, multiple outcome, multivariate, simulation, survival analysis, time-to-event
Divisions: Faculty of Health and Life Sciences
Faculty of Health and Life Sciences > Institute of Life Courses and Medical Sciences
Faculty of Health and Life Sciences > Institute of Population Health
Depositing User: Symplectic Admin
Date Deposited: 10 Aug 2023 08:23
Last Modified: 10 Aug 2023 09:21
DOI: 10.1002/sim.9771
URI: https://livrepository.liverpool.ac.uk/id/eprint/3172165