A Study on Learning Representations for Relations Between Words



Hakami, Huda
(2020) A Study on Learning Representations for Relations Between Words. PhD thesis, University of Liverpool.

[img] Text
200976570_May2020.pdf - Unspecified

Download (5MB) | Preview

Abstract

Reasoning about relations between words or entities plays an important role in human cognition. It is thus essential for a computational system which processes human languages to be able to understand the semantics of relations to simulate human intelligence. Automatic relation learning provides valuable information for many natural language processing tasks including ontology creation, question answering and machine translation, to name a few. This need brings us to the topic of this thesis where the main goal is to explore multiple resources and methodologies to effectively represent relations between words. How to effectively represent semantic relations between words remains a problem that is underexplored. A line of research makes use of relational patterns, which are the linguistic contexts in which two words co-occur in a corpus to infer a relation between them (e.g., X leads to Y). This approach suffers from data sparseness because not every related word-pair co-occurs even in a large corpus. In contrast, prior work on learning word embeddings have found that certain relations between words could be captured by applying linear arithmetic operators on the corresponding pre-trained word embeddings. Specifically, it has been shown that the vector offset (expressed as PairDiff) from one word to the other in a pair encodes the relation that holds between them, if any. Such a compositional method addresses the data sparseness by inferring a relation from constituent words in a word-pair and obviates the need of relational patterns. This thesis investigates the best way to compose word embeddings to represent relational instances. A systematic comparison is carried out for unsupervised operators, which in general reveals the superiority of the PairDiff operator on multiple word embedding models and benchmark datasets. Despite the empirical success, no theoretical analysis has been conducted so far explaining why and under what conditions PairDiff is optimal. To this end, a theoretical analysis is conducted for the generalised bilinear operators that can be used to measure the relational distance between two word-pairs. The main conclusion is that, under certain assumptions, the bilinear operator can be simplified to a linear form, where the widely used PairDiff operator is a special case. Multiple recent works raised concerns about existing unsupervised operators for inferring relations from pre-trained word embeddings. Thus, the question of whether it is possible to learn better parametrised relational compositional operators is addressed in this thesis. A supervised relation representation operator is proposed using a non-linear neural network that performs relation prediction. The evaluation on two benchmark datasets reveals that the penultimate layer of the trained neural network-based relational predictor acts as a good representation for the relations between words. Because we believe that both relational patterns and word embeddings provide complementary information to learn relations, a self-supervised context-guided relation embedding method that is trained on the two sources of information has been proposed. Experimentally, incorporating relational contexts shows improvement in the performance of a compositional operator for representing unseen word-pairs. Besides unstructured text corpora, knowledge graphs provide another source for relational facts in the form of nodes (i.e., entities) connected by edges (i.e., relations). Knowledge graphs are employed widely in natural language processing applications such as question answering and dialogue systems. Embedding entities and relations in a graph have shown impressive results for inferring previously unseen relations between entities. This thesis contributes to developing a theoretical model to infer a relationship between the connections in the graph and the embeddings of entities and relations. Learning graph embeddings that satisfy the proven theorem demonstrates efficient performance compared to existing heuristically derived graph embedding methods. As graph embedding methods generate representations for only existing relation types, a relation composition task is proposed in the thesis to tackle this limitation.

Item Type: Thesis (PhD)
Uncontrolled Keywords: Natural Language Processing, Machine Learning, Relation Representations
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 10 Jun 2020 08:54
Last Modified: 18 Jan 2023 23:50
DOI: 10.17638/03089338
Supervisors:
URI: https://livrepository.liverpool.ac.uk/id/eprint/3089338