An automation framework for clinical codelist development validated with UK data from patients with multiple long-term conditions



Aslam, A, Walker, L, Abaho, M, Cant, H, O’Connell, M, Abuzour, AS, Hama, L, Schofield, P ORCID: 0000-0002-6398-2537, Mair, FS, Ruddle, RA et al. (9 more authors) (2025) An automation framework for clinical codelist development validated with UK data from patients with multiple long-term conditions. BMC Medical Research Methodology, 25 (1). 138-. ISSN 1471-2288

An Automation Framework for Clinical Codelist Development Validated with UK Data from Patients with Multiple Long-term Conditions - medRxiv 2025.pdf - Submitted version

Abstract

Background: Codelists play a crucial role in ensuring accurate and standardized communication within healthcare. However, preparing high-quality codelists is a rigorous and time-consuming process. The literature focuses on the transparency of clinical codelists and overlooks the utility of automation.

Methods (Automated Framework Design and Use-case: DynAIRx): Here we present a Codelist Generation Framework that can automate the generation of codelists with minimal input from clinical experts. We demonstrate the process using a specific project, DynAIRx, producing appropriate codelists and a framework that allows future projects to take advantage of automated codelist generation. Both the framework and codelist are publicly available. DynAIRx is an NIHR-funded project aiming to develop AIs to help optimise prescribing of medicines in patients with multiple long-term conditions. DynAIRx requires complex codelists to describe the trajectory of each patient and the interaction between their conditions. We promptly generated ≈214 codelists for DynAIRx using the proposed framework and validated them with a panel of experts, significantly reducing the time required by making effective use of automation.

Results: The framework reduced the clinician time required to validate codes, automatically shrank codelists using trusted sources, and added new codes for review against existing codelists. In the DynAIRx case study, a codelist of ≈14,000 codes ultimately required only 7-9 hours of clinician time (whereas existing methods take months), and applying the automation framework reduced the workload by >80%.

Conclusion: This work examines current methodologies for codelist development and the challenges associated with ensuring transparency and reproducibility. A key benefit of this approach is its emphasis on automation and reliance on trusted sources, which significantly lowers the workload, minimizes human error, and saves substantial time, particularly the time needed from clinical experts.
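
The abstract does not spell out how trusted sources shrink the review workload, so the following Python fragment is only an illustrative sketch under stated assumptions, not the published framework. It assumes that a candidate code already present in any trusted source can be accepted without clinician review, and that the workload reduction is the share of candidates accepted this way. The names `Triage` and `triage_codes` and the example codes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triage:
    """Hypothetical result type: outcome of comparing candidates with trusted sources."""
    auto_accepted: frozenset  # candidates found in at least one trusted source
    for_review: frozenset     # candidates a clinician would still need to check

    @property
    def workload_reduction(self) -> float:
        # Assumed definition: fraction of candidates that skip manual review.
        total = len(self.auto_accepted) + len(self.for_review)
        return len(self.auto_accepted) / total if total else 0.0


def triage_codes(candidates, trusted_sources):
    """Split candidate codes into auto-accepted and needs-review sets.

    `candidates` is any iterable of code strings; `trusted_sources` is an
    iterable of collections of codes (for example, previously validated
    codelists). Both are assumptions made for this sketch.
    """
    candidates = frozenset(candidates)
    trusted = frozenset().union(*trusted_sources)
    return Triage(
        auto_accepted=candidates & trusted,
        for_review=candidates - trusted,
    )


# Made-up placeholder codes, not real clinical codes.
result = triage_codes(
    candidates={"A", "B", "C", "D", "E"},
    trusted_sources=[{"A", "B"}, {"C", "D", "Z"}],
)
print(sorted(result.for_review), result.workload_reduction)
```

In this toy input, four of the five candidates appear in a trusted source, so only one code is left for manual review. The real framework may weigh sources differently or apply further checks before accepting a code.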

Item Type: Article
Uncontrolled Keywords: Codelist, Automation, Multiple long term conditions (MLTC), SNOMEDs, DynAIRx
Divisions: Faculty of Health & Life Sciences
Faculty of Health & Life Sciences > Inst. Population Health
Depositing User: Symplectic Admin
Date Deposited: 02 Jun 2025 10:47
Last Modified: 28 Feb 2026 11:42
DOI: 10.1186/s12874-025-02541-1
URI: https://livrepository.liverpool.ac.uk/id/eprint/3193007