Knowledge Base Enrichment by Relation Learning from Social Tagging Data



Dong, Hang ORCID: 0000-0001-6828-6891, Wang, Wei, Coenen, Frans ORCID: 0000-0003-1026-6649 and Huang, Kaizhu
(2020) Knowledge Base Enrichment by Relation Learning from Social Tagging Data. Information Sciences, 526. pp. 203-220.

This is the latest version of this item.

Hang-INS-preprint-accepted-manuscript 30 Mar.pdf - Author Accepted Manuscript

Abstract

There has been considerable interest in transforming unstructured social tagging data into structured knowledge for semantic-based retrieval and recommendation. Research in this line mostly exploits data co-occurrence and often overlooks the complex and ambiguous meanings of tags. Furthermore, there have been few comprehensive evaluation studies regarding the quality of the discovered knowledge. We propose a supervised learning method to discover subsumption relations from tags. The key to this method is quantifying the probabilistic association among tags to better characterise their relations. We further develop an algorithm to organise tags into hierarchies based on the learned relations. Experiments were conducted using a large, publicly available dataset, Bibsonomy, and three popular, human-engineered or data-driven knowledge bases: DBpedia, Microsoft Concept Graph, and ACM Computing Classification System. We performed a comprehensive evaluation using different strategies: relation-level, ontology-level, and knowledge base enrichment based evaluation. The results clearly show that the proposed method can extract knowledge of better quality than the existing methods against the gold standard knowledge bases. The proposed approach can also enrich knowledge bases with new subsumption relations, having the potential to significantly reduce time and human effort for knowledge base maintenance and ontology evolution.
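
For context, the co-occurrence signal that the abstract says most prior work relies on can be sketched as below. This is a minimal illustration of that baseline only, not the supervised method proposed in the paper. The function names, the toy posts, and the 0.8 threshold are all assumptions made for the example.

```python
from collections import Counter
from itertools import permutations


def conditional_cooccurrence(posts):
    """Estimate P(b | a) as (#posts tagged with both a and b) / (#posts tagged with a)."""
    tag_count = Counter()
    pair_count = Counter()
    for tags in posts:
        unique = set(tags)
        tag_count.update(unique)
        # Ordered pairs, so both (a, b) and (b, a) are counted.
        pair_count.update(permutations(unique, 2))
    return {(a, b): n / tag_count[a] for (a, b), n in pair_count.items()}


def candidate_subsumptions(probs, threshold=0.8):
    """Return (narrower, broader) pairs where P(broader | narrower) meets the
    threshold (an assumed value) and exceeds the probability in the reverse direction."""
    return sorted(
        (a, b)
        for (a, b), p in probs.items()
        if p >= threshold and probs[(b, a)] < p
    )


# Hypothetical toy data: each set holds the tags attached to one resource.
posts = [
    {"python", "programming"},
    {"java", "programming"},
    {"python", "programming", "tutorial"},
    {"programming"},
]
probs = conditional_cooccurrence(posts)
candidates = candidate_subsumptions(probs)
```

On this toy data the heuristic proposes "python" under "programming", but it also proposes "tutorial" under "python", which illustrates how co-occurrence alone can mistake incidental tagging for subsumption.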

Item Type: Article
Uncontrolled Keywords: Knowledge discovery, Knowledge base enrichment, Ontology learning, Social tagging, Probabilistic association analysis, Classification
Depositing User: Symplectic Admin
Date Deposited: 24 Apr 2020 12:59
Last Modified: 18 Jan 2023 23:54
DOI: 10.1016/j.ins.2020.04.002
URI: https://livrepository.liverpool.ac.uk/id/eprint/3084045