Paper Title

Extremal GloVe: Theoretically Accurate Distributed Word Embedding by Tail Inference

Author

Wang, Hao

Abstract

Distributed word embeddings such as Word2Vec and GloVe have been widely adopted in industrial settings. Major technical applications of GloVe include recommender systems and natural language processing. The theory behind GloVe relies on the choice of a weighting function in the weighted least squares formulation; this function computes a powered ratio of the word occurrence count to the maximum word count in the corpus. However, the initial formulation of GloVe is not theoretically sound in two respects: the choice of the weighting function and the choice of its power exponent are both ad hoc. In this paper, we use the theory of extreme value analysis to propose a theoretically accurate version of GloVe, obtained by reformulating the weighted least squares loss function as an expected loss function and choosing the power exponent accurately. We demonstrate the competitiveness of our algorithm and show that the initial formulation of GloVe with the suggested optimal parameter can be viewed as a special case of our paradigm.
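
For readers unfamiliar with the weighting function the abstract refers to, the following is a minimal sketch of the weighted least squares objective as it is usually stated for the original GloVe, not the extreme-value reformulation proposed in this paper. The function names, the dictionary-based co-occurrence format, and the defaults `x_max=100.0` and `alpha=0.75` (the values commonly quoted for the original GloVe) are assumptions made for illustration. The abstract describes the denominator as the maximum word count in the corpus; here it is simply left as a parameter.

```python
import math


def glove_weight(x, x_max=100.0, alpha=0.75):
    """Powered-ratio weight f(x) = (x / x_max) ** alpha, capped at 1."""
    if x <= 0:
        return 0.0  # zero counts contribute nothing to the loss
    return min(1.0, (x / x_max) ** alpha)


def weighted_ls_loss(cooc, w, w_ctx, b, b_ctx, x_max=100.0, alpha=0.75):
    """Sum of f(X_ij) * (w_i . w_ctx_j + b_i + b_ctx_j - log X_ij) ** 2.

    cooc  : dict mapping (i, j) index pairs to co-occurrence counts X_ij
    w     : list of word vectors, w_ctx: list of context vectors
    b     : list of word biases,  b_ctx: list of context biases
    """
    total = 0.0
    for (i, j), x in cooc.items():
        if x <= 0:
            continue  # log(0) is undefined; such pairs are skipped
        dot = sum(a * c for a, c in zip(w[i], w_ctx[j]))
        residual = dot + b[i] + b_ctx[j] - math.log(x)
        total += glove_weight(x, x_max, alpha) * residual ** 2
    return total
```

In this sketch the exponent `alpha` and the shape of `glove_weight` are exactly the two choices the abstract calls ad hoc; the paper's contribution is to derive them from extreme value analysis rather than tune them empirically.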
