Sanderson, R and Watry, P
ORCID: 0000-0001-9723-1646
(2007)
Integrating data and text mining processes for digital library applications
In: JCDL07: Joint Conference on Digital Libraries, New York.
Text: p73-sanderson.pdf - Author Accepted Manuscript (297kB)
Abstract
This paper explores the integration of text mining and data mining techniques, digital library systems, and computational and data grid technologies with the objective of developing an online classification service exemplar. We discuss the current research issues relating to the use of data mining algorithms and toolkits for textual data; the necessary changes within the Cheshire3 Information Framework to accommodate analysis workflows; the outcomes of a demonstrator based on the National Library of Medicine's Medline dataset; and the provision of comparable metrics for evaluation purposes. The prototype has resulted in extremely accurate online classification services and offers a novel method of supporting text mining and data mining within a highly scaled computational environment, integrated seamlessly into the digital library architecture. Copyright 2007 ACM.
| Field | Value |
|---|---|
| Item Type: | Conference Item (Unspecified) |
| Additional Information: | TULIP Type: Conference Proceedings (contribution); issn: 978-1-59593-644-8 |
| Uncontrolled Keywords: | 4605 Data Management and Data Science, 46 Information and Computing Sciences, 4610 Library and Information Studies |
| Depositing User: | Symplectic Admin |
| Date Deposited: | 04 Dec 2017 09:25 |
| Last Modified: | 22 Jan 2026 19:48 |
| DOI: | 10.1145/1255175.1255188 |
| URI: | https://livrepository.liverpool.ac.uk/id/eprint/3013216 |
| Disclaimer: | The University of Liverpool is not responsible for content contained on other websites from links within repository metadata. Please contact us if you notice anything that appears incorrect or inappropriate. |