Integrating data and text mining processes for digital library applications



Sanderson, Robert and Watry, Paul (ORCID: 0000-0001-9723-1646) (2007) Integrating data and text mining processes for digital library applications. In: JCDL07: Joint Conference on Digital Libraries, New York.

Text: p73-sanderson.pdf - Author Accepted Manuscript (297kB)

Abstract

This paper explores the integration of text mining and data mining techniques, digital library systems, and computational and data grid technologies with the objective of developing an online classification service exemplar. We discuss the current research issues relating to the use of data mining algorithms and toolkits for textual data; the necessary changes within the Cheshire3 Information Framework to accommodate analysis workflows; the outcomes of a demonstrator based on the National Library of Medicine's Medline dataset; and the provision of comparable metrics for evaluation purposes. The prototype has resulted in extremely accurate online classification services and offers a novel method of supporting text mining and data mining within a highly scaled computational environment, integrated seamlessly into the digital library architecture. Copyright 2007 ACM.

Item Type: Conference or Workshop Item (Unspecified)
Additional Information: TULIP Type: Conference Proceedings (contribution); ISBN: 978-1-59593-644-8
Uncontrolled Keywords: Networking and Information Technology R&D (NITRD), Bioengineering
Depositing User: Symplectic Admin
Date Deposited: 04 Dec 2017 09:25
Last Modified: 15 Mar 2024 07:20
DOI: 10.1145/1255175.1255188
URI: https://livrepository.liverpool.ac.uk/id/eprint/3013216