Using $k$-way Co-occurrences for Learning Word Embeddings

Bollegala, Danushka ORCID: 0000-0003-4476-7003, Yoshida, Yuichi and Kawarabayashi, Ken-ichi (2018) Using $k$-way Co-occurrences for Learning Word Embeddings. Proceedings of the National Conference on Artificial Intelligence, abs/17. pp. 5037-5044.

Text
1709.01199v1.pdf - Author Accepted Manuscript

Abstract

Co-occurrences between two words provide useful insights into the semantics of those words. Consequently, much prior work on word embedding learning has used co-occurrences between two words as the training signal for learning word embeddings. However, in natural language texts it is common for multiple words to be related and to co-occur in the same context. We extend the notion of co-occurrences to cover $k(\geq\!\!2)$-way co-occurrences among a set of $k$ words. Specifically, we prove a theoretical relationship between the joint probability of $k(\geq\!\!2)$ words and the sum of $\ell_2$ norms of their embeddings. Next, we propose a learning objective, motivated by our theoretical result, that utilises $k$-way co-occurrences for learning word embeddings. Our experimental results show that the derived theoretical relationship does indeed hold empirically and that, despite data sparsity, for some smaller values of $k$, $k$-way embeddings perform comparably to or better than $2$-way embeddings in a range of tasks.
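
To make the notion of a $k$-way co-occurrence concrete, the following is a minimal Python sketch that counts how often each unordered set of $k$ distinct words appears together in a context. It is an illustration only, not the authors' implementation: the function name `count_k_way_cooccurrences`, the toy corpus, and the choice of a whole tokenised sentence as the "context" are all assumptions made here.

```python
from collections import Counter
from itertools import combinations


def count_k_way_cooccurrences(sentences, k):
    """Count how often each unordered set of k distinct words shares a context.

    Assumption: one tokenised sentence is one context. The paper may define
    contexts differently (for example, as a fixed-size window).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counts = Counter()
    for tokens in sentences:
        # Deduplicate and sort so each word set has one canonical tuple key.
        distinct_words = sorted(set(tokens))
        for word_set in combinations(distinct_words, k):
            counts[word_set] += 1
    return counts


# Hypothetical toy corpus, used only for illustration.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "cat", "ate", "the", "fish"],
]

pair_counts = count_k_way_cooccurrences(corpus, 2)    # ordinary 2-way counts
triple_counts = count_k_way_cooccurrences(corpus, 3)  # 3-way counts
```

In this sketch each context of $n$ distinct words contributes $\binom{n}{k}$ word sets, so the number of distinct $k$-way events grows quickly with $k$ while each individual event is observed less often, which is one way to see the data sparsity mentioned in the abstract.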

Item Type:	Article
Uncontrolled Keywords:	cs.CL
Depositing User:	Symplectic Admin
Date Deposited:	11 Sep 2017 09:48
Last Modified:	19 Jan 2023 06:55
URI:	https://livrepository.liverpool.ac.uk/id/eprint/3009360