Word and Document Embedding with vMF-Mixture Priors on Context Word Vectors

Jameel, Shoaib and Schockaert, Steven (2019) Word and Document Embedding with vMF-Mixture Priors on Context Word Vectors. In: The 57th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference. ACL. ISBN 978-1-950737-48-2.

PDF - Author's Accepted Manuscript
Official URL
https://www.aclweb.org/anthology/events/acl-2019/

Abstract

Word embedding models typically learn two types of vectors: target word vectors and context word vectors. These vectors are normally learned such that they are predictive of some word co-occurrence statistic, but they are otherwise unconstrained. However, the words from a given language can be organized in various natural groupings, such as syntactic word classes (e.g. nouns, adjectives, verbs) and semantic themes (e.g. sports, politics, sentiment). Our hypothesis in this paper is that embedding models can be improved by explicitly imposing a cluster structure on the set of context word vectors. To this end, our model relies on the assumption that context word vectors are drawn from a mixture of von Mises-Fisher (vMF) distributions, where the parameters of this mixture distribution are jointly optimized with the word vectors. We show that this results in word vectors which are qualitatively different from those obtained with existing word embedding models. We furthermore show that our embedding model can also be used to learn high-quality document representations.
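
To make the vMF-mixture assumption in the abstract concrete, the following is a minimal sketch of how the density of a unit-length context vector under such a mixture could be evaluated, together with the soft cluster assignments it implies. It is not the authors' implementation: the function names (`log_bessel_i`, `vmf_log_density`, `mixture_responsibilities`), the truncated power series used for the Bessel function, and the toy 3-dimensional inputs are all assumptions made for illustration, and the sketch does not perform the joint optimization with the word vectors that the abstract describes.

```python
import math

def log_bessel_i(order, kappa, terms=200):
    """Log of the modified Bessel function I_order(kappa), from its power series.

    Assumption: 200 terms is enough for moderate kappa (roughly kappa < 100).
    """
    log_half = math.log(kappa / 2.0)
    logs = [(2 * m + order) * log_half
            - math.lgamma(m + 1) - math.lgamma(m + order + 1)
            for m in range(terms)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def vmf_log_density(x, mu, kappa):
    """Log density of a vMF distribution at unit vector x.

    mu is the unit-length mean direction, kappa > 0 the concentration.
    """
    d = len(x)
    order = d / 2.0 - 1.0
    log_norm = (order * math.log(kappa)
                - (d / 2.0) * math.log(2.0 * math.pi)
                - log_bessel_i(order, kappa))
    return log_norm + kappa * sum(a * b for a, b in zip(mu, x))

def mixture_responsibilities(x, weights, mus, kappas):
    """Return (posterior component probabilities, mixture log density) for x."""
    logs = [math.log(w) + vmf_log_density(x, mu, k)
            for w, mu, k in zip(weights, mus, kappas)]
    top = max(logs)
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm], top + math.log(total)

# Toy example (hypothetical values): two clusters with orthogonal mean directions.
resp, log_p = mixture_responsibilities(
    x=(0.0, 0.0, 1.0),
    weights=[0.5, 0.5],
    mus=[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)],
    kappas=[10.0, 10.0],
)
print(resp, log_p)
```

In this toy setting the vector coincides with the first mean direction, so nearly all of the posterior mass falls on the first component. Working in log space keeps the computation stable when the concentration parameters are large.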

Item Type: Conference or workshop item (Proceeding)
Divisions: Faculties > Sciences > School of Computing > Data Science
Depositing User: Shoaib Jameel
Date Deposited: 26 Jun 2019 08:38 UTC
Last Modified: 13 Feb 2020 16:39 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/74583 (The current URI for this page, for reference purposes)