A classification approach for detecting cross-lingual biomedical term translations



Hakami, H and Bollegala, D (ORCID: 0000-0003-4476-7003) (2017) A classification approach for detecting cross-lingual biomedical term translations. Natural Language Engineering, 23 (1). pp. 31-51.

Text: biotrans (1).pdf - Author Accepted Manuscript (542kB)

Abstract

Finding translations for technical terms is an important problem in machine translation. In particular, in highly specialized domains such as biology or medicine, it is difficult to find bilingual experts to annotate sufficient cross-lingual texts in order to train machine translation systems. Moreover, new terms are constantly being generated in the biomedical community, which makes it difficult to keep the translation dictionaries up to date for all language pairs of interest. Given a biomedical term in one language (source language), we propose a method for detecting its translations in a different language (target language). Specifically, we train a binary classifier to determine whether two biomedical terms written in two languages are translations. Training such a classifier is often complicated due to the lack of common features between the source and target languages. We propose several feature space concatenation methods to successfully overcome this problem. Moreover, we study the effectiveness of contextual and character n-gram features for detecting term translations. Experiments conducted using a standard dataset for biomedical term translation show that the proposed method outperforms several competitive baseline methods in terms of mean average precision and top-k translation accuracy.
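
The abstract names character n-gram features, feature space concatenation, and two ranking metrics without giving details. The sketch below is one possible reading, not the authors' implementation: all function names, the `src:`/`tgt:` prefixing scheme, the boundary markers, and the example term pair are assumptions made for illustration. The metrics follow the usual textbook definitions of average precision and top-k accuracy.

```python
from collections import Counter


def char_ngrams(term, n_values=(2, 3)):
    """Count character n-grams of a term, padded with boundary markers (assumed)."""
    padded = f"^{term.lower()}$"
    grams = Counter()
    for n in n_values:
        for i in range(len(padded) - n + 1):
            grams[padded[i:i + n]] += 1
    return grams


def concat_pair_features(source_term, target_term):
    """Hypothetical concatenation: keep each language's n-grams in its own
    sub-space by prefixing the feature name with the side it came from."""
    features = {}
    for gram, count in char_ngrams(source_term).items():
        features["src:" + gram] = count
    for gram, count in char_ngrams(target_term).items():
        features["tgt:" + gram] = count
    return features


def average_precision(ranked, gold):
    """Average precision of one ranked candidate list against the set of
    correct translations, normalised by the number of correct translations."""
    hits, total = 0, 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
            total += hits / rank
    return total / len(gold) if gold else 0.0


def mean_average_precision(rankings, gold_sets):
    """Mean of the per-source-term average precision values."""
    scores = [average_precision(r, g) for r, g in zip(rankings, gold_sets)]
    return sum(scores) / len(scores) if scores else 0.0


def top_k_accuracy(rankings, gold_sets, k):
    """Fraction of source terms with a correct translation in the top k."""
    found = sum(1 for r, g in zip(rankings, gold_sets) if any(c in g for c in r[:k]))
    return found / len(rankings) if rankings else 0.0


if __name__ == "__main__":
    # Illustrative English/Spanish pair, not taken from the paper's dataset.
    pair = concat_pair_features("insulin", "insulina")
    print(len(pair), "features for the pair")
    print(mean_average_precision([["x", "a", "b"]], [{"a", "b"}]))
```

The dictionary returned by `concat_pair_features` could be passed to any binary classifier that accepts sparse feature dictionaries. Candidate target terms would then be ranked by classifier score before computing the two metrics.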

Item Type: Article
Depositing User: Symplectic Admin
Date Deposited: 26 Oct 2016 09:20
Last Modified: 19 Jan 2023 07:27
DOI: 10.1017/S1351324915000431
URI: https://livrepository.liverpool.ac.uk/id/eprint/3003967