Integrating data and text mining processes for digital library applications



Sanderson, R and Watry, P ORCID: 0000-0001-9723-1646
(2007) Integrating data and text mining processes for digital library applications. In: JCDL07: Joint Conference on Digital Libraries, New York.

Text: p73-sanderson.pdf - Author Accepted Manuscript (297kB)

Abstract

This paper explores the integration of text mining and data mining techniques, digital library systems, and computational and data grid technologies with the objective of developing an online classification service exemplar. We discuss the current research issues relating to the use of data mining algorithms and toolkits for textual data; the necessary changes within the Cheshire3 Information Framework to accommodate analysis workflows; the outcomes of a demonstrator based on the National Library of Medicine's Medline dataset; and the provision of comparable metrics for evaluation purposes. The prototype has resulted in extremely accurate online classification services and offers a novel method of supporting text mining and data mining within a highly scaled computational environment, integrated seamlessly into the digital library architecture. Copyright 2007 ACM.

Item Type: Conference Item (Unspecified)
Additional Information: TULIP Type: Conference Proceedings (contribution); ISBN: 978-1-59593-644-8
Uncontrolled Keywords: 4605 Data Management and Data Science, 46 Information and Computing Sciences, 4610 Library and Information Studies
Depositing User: Symplectic Admin
Date Deposited: 04 Dec 2017 09:25
Last Modified: 22 Jan 2026 19:48
DOI: 10.1145/1255175.1255188
URI: https://livrepository.liverpool.ac.uk/id/eprint/3013216