Jameel, Shoaib, Schockaert, Steven (2019) Word and Document Embedding with vMF-Mixture Priors on Context Word Vectors. In: The 57th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference. ACL. ISBN 978-1-950737-48-2. (KAR id:74583)
Author's Accepted Manuscript
Language: English
Official URL: https://www.aclweb.org/anthology/events/acl-2019/
Abstract
Word embedding models typically learn two types of vectors: target word vectors and context word vectors. These vectors are normally learned such that they are predictive of some word co-occurrence statistic, but they are otherwise unconstrained. However, the words from a given language can be organized in various natural groupings, such as syntactic word classes (e.g. nouns, adjectives, verbs) and semantic themes (e.g. sports, politics, sentiment). Our hypothesis in this paper is that embedding models can be improved by explicitly imposing a cluster structure on the set of context word vectors. To this end, our model relies on the assumption that context word vectors are drawn from a mixture of von Mises-Fisher (vMF) distributions, where the parameters of this mixture distribution are jointly optimized with the word vectors. We show that this results in word vectors which are qualitatively different from those obtained with existing word embedding models. We furthermore show that our embedding model can also be used to learn high-quality document representations.
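As an illustration of the prior mentioned in the abstract, the following is a minimal sketch of how the log-density of a vMF mixture can be evaluated for a unit vector. It is not the authors' implementation and does not cover the joint optimisation with the word vectors. To stay self-contained it assumes three-dimensional vectors, where the vMF normalising constant has the closed form kappa / (4 * pi * sinh(kappa)); real embeddings are higher-dimensional, where the constant involves a modified Bessel function. The function names and all parameter values are hypothetical.

```python
import math


def vmf_log_density_3d(x, mu, kappa):
    """Log-density of a vMF distribution on the unit sphere in R^3.

    Assumes x and mu are unit vectors of length 3 and kappa > 0.
    Closed-form normaliser for d = 3: kappa / (4 * pi * sinh(kappa)).
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    dot = sum(a * b for a, b in zip(x, mu))
    # log(sinh(kappa)) written so that it does not overflow for large kappa.
    log_sinh = kappa + math.log1p(-math.exp(-2.0 * kappa)) - math.log(2.0)
    log_norm = math.log(kappa) - math.log(4.0 * math.pi) - log_sinh
    return log_norm + kappa * dot


def vmf_mixture_log_density_3d(x, weights, means, kappas):
    """Log-density of a mixture of 3D vMF components at the unit vector x.

    weights are assumed to be positive and to sum to one.
    """
    terms = [
        math.log(w) + vmf_log_density_3d(x, m, k)
        for w, m, k in zip(weights, means, kappas)
    ]
    # Log-sum-exp over the components for numerical stability.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical two-cluster prior: one cluster around the z-axis, one around the x-axis.
weights = [0.5, 0.5]
means = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
kappas = [10.0, 10.0]

near_cluster = vmf_mixture_log_density_3d((0.0, 0.0, 1.0), weights, means, kappas)
far_from_both = vmf_mixture_log_density_3d((0.0, -1.0, 0.0), weights, means, kappas)
```

In this sketch a vector lying on a cluster mean receives a higher log-density than one orthogonal to both means, which is the sense in which such a prior encourages a cluster structure. The concentration parameter kappa controls how tightly each component is peaked around its mean direction.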
Item Type: | Conference or workshop item (Proceeding) |
---|---|
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Shoaib Jameel |
Date Deposited: | 26 Jun 2019 08:38 UTC |
Last Modified: | 05 Nov 2024 12:37 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/74583 (The current URI for this page, for reference purposes) |