PUblications Metadata Augmentation (PUMA) pipeline



Butters, Oliver W ORCID: 0000-0003-0354-8461, Wilson, Rebecca C ORCID: 0000-0003-2294-593X, Garner, Hugh and Burton, Thomas WY
(2020) PUblications Metadata Augmentation (PUMA) pipeline. F1000Research, 9. p. 1095.

Abstract

Cohort studies collect, generate and distribute data over long periods of time – often over the lifecourse of their participants. It is common for these studies to host a list of publications (which can number many thousands) on their website to demonstrate the impact of the study and to make it easier to search the existing research to which the study data has contributed. The ability to search and explore these publication lists varies greatly between studies. We believe a lack of rich search and exploration functionality is a barrier to entry for new or prospective users of a study’s data, since it may be difficult to find and evaluate previous work in a given area. These lists of publications are also typically manually curated, so they lack the rich metadata needed for bibliometric analysis. We present here a software pipeline that aggregates metadata from a variety of third-party providers to power a web-based search and exploration tool for lists of publications. Alongside core publication metadata (e.g. author lists and keywords), we include citation data and geocoding of first authors in our pipeline. This allows a study to be characterised as a whole, based on the common locations of authors, the frequency of keywords, the citation profile etc. This enriched publications metadata can be used to generate project impact metrics and web-based graphics for public dissemination. In addition, the pipeline produces a research data set for bibliometric analysis or social studies of science.
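
The kind of study-level characterisation described in the abstract can be illustrated with a minimal sketch. This is not the PUMA code itself: the `Publication` record, its field names and the `summarise` function are all hypothetical, and the sketch assumes the metadata have already been aggregated and the first authors already geocoded to a country code.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Publication:
    """Hypothetical enriched record; field names are illustrative only."""
    title: str
    keywords: List[str] = field(default_factory=list)
    # Assumed output of a geocoding step; None when no location was found.
    first_author_country: Optional[str] = None
    citations: int = 0


def summarise(publications):
    """Summarise a publication list by keyword, author location and citations."""
    # Normalise keywords so that "Asthma" and "asthma " are counted together.
    keyword_counts = Counter(
        kw.strip().lower() for pub in publications for kw in pub.keywords
    )
    # Count first-author countries, skipping records that were not geocoded.
    country_counts = Counter(
        pub.first_author_country
        for pub in publications
        if pub.first_author_country is not None
    )
    total_citations = sum(pub.citations for pub in publications)
    return {
        "keyword_counts": keyword_counts,
        "country_counts": country_counts,
        "total_citations": total_citations,
    }
```

Calling `summarise` on a list of such records returns two `Counter` objects and a citation total, which could feed a keyword cloud, a map of author locations or a simple impact metric. Counting by country is only one possible granularity; a real pipeline might keep institution names or coordinates instead.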

Item Type: Article
Uncontrolled Keywords: Longitudinal birth cohort, Bibliography, Bibliometrics, ALSPAC
Depositing User: Symplectic Admin
Date Deposited: 17 Sep 2020 10:11
Last Modified: 15 Mar 2024 17:16
DOI: 10.12688/f1000research.25484.1
Open Access URL: http://doi.org/10.12688/f1000research.25484.1
URI: https://livrepository.liverpool.ac.uk/id/eprint/3101550