Document Ranking for Curated Document Databases Using BERT and Knowledge Graph Embeddings: Introducing GRAB-Rank



Muhammad, Iqra, Bollegala, Danushka ORCID: 0000-0003-4476-7003, Coenen, Frans ORCID: 0000-0003-1026-6649, Gamble, Carrol ORCID: 0000-0002-3021-1955, Kearney, Anna ORCID: 0000-0003-1404-3370 and Williamson, Paula ORCID: 0000-0001-9802-6636
(2021) Document Ranking for Curated Document Databases Using BERT and Knowledge Graph Embeddings: Introducing GRAB-Rank.

dawakIqraMuhammad_2021.pdf - Author Accepted Manuscript

Abstract

Curated Document Databases (CDDs) play an important role in helping researchers find relevant articles in the scientific literature. Considerable recent attention has been given to the use of document ranking algorithms to support the maintenance of CDDs. The typical approach is to represent the update document collection using a form of word embedding and to input this into a ranking model; the resulting document rankings can then be used to decide which documents should be added to the CDD and which should be rejected. The hypothesis considered in this paper is that a better ranking model can be produced if a hybrid embedding is used. To this end the Knowledge Graph And BERT Ranking (GRAB-Rank) approach is presented. The Online Resource for Recruitment research in Clinical trials (ORRCA) CDD was used as a focus for the work and as a means of evaluating the proposed technique. The GRAB-Rank approach is fully described and evaluated in the context of learning to rank for the purpose of maintaining CDDs. The evaluation indicates that the hypothesis is correct: the hybrid embedding outperforms the individual embeddings used in isolation. The evaluation also indicates that GRAB-Rank outperforms a traditional approach based on BM25 and an n-gram-based SVR document ranking approach.
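The general idea of ranking documents from a hybrid embedding can be illustrated with a minimal sketch. This is not the GRAB-Rank implementation: the abstract does not say how the two embeddings are combined or which ranking model is trained, so the sketch assumes simple concatenation and a fixed linear scorer. The function names, the toy vectors and the weights are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def hybrid_embedding(text_vec: Sequence[float], kg_vec: Sequence[float]) -> List[float]:
    """Combine a text embedding and a knowledge-graph embedding.

    Concatenation is an assumption made for this sketch; the abstract
    does not specify the fusion method.
    """
    return list(text_vec) + list(kg_vec)


def score(vec: Sequence[float], weights: Sequence[float]) -> float:
    """Linear relevance score, standing in for a trained ranking model."""
    if len(vec) != len(weights):
        raise ValueError("embedding and weight vector must have the same length")
    return sum(v * w for v, w in zip(vec, weights))


def rank_documents(
    docs: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    weights: Sequence[float],
) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted from most to least relevant."""
    scored = [
        (doc_id, score(hybrid_embedding(text_vec, kg_vec), weights))
        for doc_id, (text_vec, kg_vec) in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy candidate documents: (text embedding, knowledge-graph embedding).
# Real embeddings would have hundreds of dimensions; these are made up.
docs = {
    "doc_a": ([0.9, 0.1], [0.8]),
    "doc_b": ([0.2, 0.7], [0.1]),
    "doc_c": ([0.5, 0.5], [0.9]),
}

# Hypothetical weights; in practice these would be learned from labelled data.
weights = [1.0, 0.0, 1.0]

ranking = rank_documents(docs, weights)
for doc_id, relevance in ranking:
    print(f"{doc_id}: {relevance:.2f}")
```

In a CDD update setting, the top of such a ranking would be the candidates considered for inclusion and the tail the candidates considered for rejection. Where the cut-off falls is a separate decision that this sketch does not model.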

Item Type: Conference or Workshop Item (Unspecified)
Uncontrolled Keywords: BERT, Knowledge graph concepts, Document ranking
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 05 Jul 2021 13:51
Last Modified: 23 Nov 2023 22:56
DOI: 10.1007/978-3-030-86534-4_10
URI: https://livrepository.liverpool.ac.uk/id/eprint/3128585