Mapping tweets to a known disease epidemiology; a case study of Lyme disease in the United Kingdom and Republic of Ireland.



Tulloch, John SP, Vivancos, Roberto, Christley, Rob M, Radford, Alan D ORCID: 0000-0002-4590-1334 and Warner, Jenny C
(2019) Mapping tweets to a known disease epidemiology; a case study of Lyme disease in the United Kingdom and Republic of Ireland. Journal of Biomedical Informatics, 100S. 100060.

Manuscript_3.docx - Accepted Version


Abstract

<h4>Background</h4>Analysis of social media is an emerging method with potential as a tool for disease surveillance. Twitter may offer a route for surveillance by using tweeting habits as a proxy for disease incidence. Previous work has focused on temporal patterns and has proven successful. However, the identification of geographical patterns has been limited by a combination of Twitter's data collection policies and a focus on diseases that have a high prevalence and high levels of public awareness. We propose that, by performing a geographically restricted search on a disease with a relatively low incidence, one may be able to explore spatial patterns. Here, Lyme disease in the United Kingdom and the Republic of Ireland is used as a case example.<h4>Objective</h4>To explore whether the tweeting habits of British and Irish Twitter users matched the known spatio-temporal epidemiology of Lyme disease in these countries.<h4>Methods</h4>All tweets containing the word 'Lyme' were collected between 1 July 2017 and 30 June 2018, restricted by geography (a 375-mile radius around the geographical centre of Great Britain) and by language (English-only tweets). Tweets that referred to locations with 'Lyme' in their name (e.g. Lyme Regis) were removed. Only original tweets were analysed. Daily and monthly time series were created and compared to national Lyme disease surveillance figures. A map of the number of Twitter users tweeting about Lyme disease per 100,000 population per local authority was created. This was formally compared to national surveillance data for England and Wales using an exploratory spatial data analysis approach.<h4>Results</h4>During the study period, 13,757 original tweets containing the word 'Lyme', excluding place names relating to Lyme, were collected. The mean number of daily tweets was 38 (range: 12-276).
There was strong seasonality, with the highest number of tweets in the summer; this matched the known epidemiology of Lyme disease. Of the 5212 users who tweeted about Lyme disease, 51.8% had a user profile location that could be matched to a local authority in the United Kingdom or Republic of Ireland. The mean number of Twitter users tweeting about Lyme disease per 100,000 population per local authority was 3.7. The areas with the highest incidence were south-west England and the Highlands of Scotland. When compared with English and Welsh Lyme disease surveillance figures, these figures showed a significant positive spatial correlation (p = 0.002).<h4>Conclusions</h4>The spatio-temporal pattern of Twitter users tweeting about Lyme disease matches the known disease epidemiology. The degree of geographical concordance between Twitter users' locations and national surveillance reports indicates that Twitter has the potential to be used to identify potential disease hotspots based on the levels of social media 'noise'. There is scope for further work to test the robustness of Twitter as an adjunct 'measure of concern' disease surveillance tool. However, caution must be taken, as national media stories can skew data and Twitter users may not provide reliable facts in the data that they share on the platform.
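
The filtering and rate calculation described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code: the tweet record fields (`user`, `text`, `is_retweet`, `authority`), the function names, and the one-entry place-name list are all assumptions made for illustration, since the abstract names only Lyme Regis as an example and does not describe the matching procedure in detail.

```python
from collections import defaultdict

# Assumed list of place names containing "Lyme"; the abstract gives only
# Lyme Regis as an example, so a real list would likely be longer.
PLACE_NAMES = ("lyme regis",)


def is_relevant(text, is_retweet):
    """Keep original tweets that mention 'Lyme' outside a listed place name."""
    if is_retweet:
        return False
    lowered = text.lower()
    for place in PLACE_NAMES:
        lowered = lowered.replace(place, "")
    return "lyme" in lowered


def users_per_100k(tweets, populations):
    """Distinct tweeting users per 100,000 population for each local authority.

    `tweets` is an iterable of dicts with the hypothetical keys described
    above; `populations` maps local authority name to resident population.
    """
    users = defaultdict(set)
    for t in tweets:
        if t["authority"] in populations and is_relevant(t["text"], t["is_retweet"]):
            users[t["authority"]].add(t["user"])
    return {la: 100_000 * len(users[la]) / pop for la, pop in populations.items()}
```

Counting distinct users rather than tweets follows the abstract's unit of analysis ("Twitter users tweeting about Lyme disease per 100,000 population"), so one prolific account does not inflate a local authority's rate.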

Item Type: Article
Depositing User: Symplectic Admin
Date Deposited: 09 Oct 2019 08:13
Last Modified: 29 Aug 2022 18:10
DOI: 10.1016/j.yjbinx.2019.100060
URI: https://livrepository.liverpool.ac.uk/id/eprint/3057564