Maintaining Curated Document Databases Using a Learning to Rank Model: The ORRCA Experience



Muhammad, Iqra, Bollegala, Danushka ORCID: 0000-0003-4476-7003, Coenen, Frans ORCID: 0000-0003-1026-6649, Gamble, Carol ORCID: 0000-0002-3021-1955, Kearney, Anna ORCID: 0000-0003-1404-3370 and Williamson, Paula ORCID: 0000-0001-9802-6636
(2020) Maintaining Curated Document Databases Using a Learning to Rank Model: The ORRCA Experience.

bcsSGAI_AI2020_IQ.pdf - Author Accepted Manuscript


Abstract

Curated Document Databases play a critical role in helping researchers find relevant articles in the available literature. One such database is the ORRCA (Online Resource for Recruitment research in Clinical trials) database, which brings together published work in the field of clinical trials recruitment research into a single searchable collection. Document databases such as ORRCA require year-on-year updating because further relevant documents continually become available. Updating curated databases is a labour-intensive and time-consuming task. Machine learning techniques can help to automate the update process and reduce the workload needed to screen articles for inclusion. This paper presents an automated approach, based on a learning to rank model, to updating the ORRCA document repository. The approach is evaluated using the documents in the ORRCA database: data from the original ORRCA systematic review were used to train the learning to rank model, and data from the ORRCA 2015 and 2017 updates were used to evaluate its performance. The evaluation demonstrated that significant resource savings can be made using the proposed approach.
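
The idea of saving screening effort by ranking candidate documents can be illustrated with a minimal sketch. It is not the model or the evaluation measure used in the paper: the scores below are placeholder numbers standing in for the output of any learning to rank model, the document identifiers are toy data, and the function names (rank_by_score, screened_to_reach_recall) are hypothetical. The sketch only shows how a ranked list might translate into fewer documents to screen by hand.

```python
import math
from typing import Iterable, List, Sequence


def rank_by_score(doc_ids: Sequence[str], scores: Sequence[float]) -> List[str]:
    """Return document ids ordered from highest to lowest score."""
    order = sorted(range(len(doc_ids)), key=lambda i: scores[i], reverse=True)
    return [doc_ids[i] for i in order]


def screened_to_reach_recall(
    ranked_ids: Sequence[str],
    relevant_ids: Iterable[str],
    target_recall: float,
) -> int:
    """Count how many top-ranked documents must be screened before the
    target fraction of relevant documents has been found."""
    relevant = set(relevant_ids)
    needed = math.ceil(target_recall * len(relevant))
    if needed == 0:
        return 0
    found = 0
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            found += 1
            if found >= needed:
                return position
    # The target was not reachable; every document had to be screened.
    return len(ranked_ids)


# Toy data (assumed for illustration only, not taken from ORRCA).
doc_ids = ["d1", "d2", "d3", "d4", "d5"]
scores = [0.2, 0.9, 0.4, 0.8, 0.1]   # placeholder model scores
relevant = {"d2", "d3"}              # placeholder human inclusion decisions

ranked = rank_by_score(doc_ids, scores)
screened = screened_to_reach_recall(ranked, relevant, target_recall=1.0)
fraction_saved = 1 - screened / len(ranked)
print(ranked, screened, fraction_saved)
```

In this toy case a reviewer working down the ranked list finds both relevant documents without reading the whole candidate set, which is the kind of saving the abstract refers to in general terms.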

Item Type: Conference or Workshop Item (Unspecified)
Uncontrolled Keywords: 2 Aetiology, 2.6 Resources and infrastructure (aetiology)
Depositing User: Symplectic Admin
Date Deposited: 16 Sep 2020 10:29
Last Modified: 14 Mar 2024 19:01
DOI: 10.1007/978-3-030-63799-6_26
URI: https://livrepository.liverpool.ac.uk/id/eprint/3101268