Multi-Adversarial Learning for Cross-Lingual Word Embeddings

Paper Title

Multi-Adversarial Learning for Cross-Lingual Word Embeddings

Authors

Haozhou Wang, James Henderson, Paola Merlo

Abstract

Generative adversarial networks (GANs) have succeeded in inducing cross-lingual word embeddings (maps of matching words across languages) without supervision. Despite these successes, GANs' performance in the difficult case of distant languages is still unsatisfactory. These limitations have been explained by GANs' incorrect assumption that source and target embedding spaces are related by a single linear mapping and are approximately isomorphic. We assume instead that, especially across distant languages, the mapping is only piecewise linear, and propose a multi-adversarial learning method. This novel method induces the seed cross-lingual dictionary through multiple mappings, each induced to fit the mapping for one subspace. Our experiments on unsupervised bilingual lexicon induction show that this method improves performance over previous single-mapping methods, especially for distant languages.
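The piecewise-linear assumption can be illustrated with a toy sketch. The abstract does not describe how subspaces are defined or how the mappings are trained, so everything below is an assumption made for illustration: subspaces are approximated by nearest-centroid assignment, the per-subspace maps are taken as given (random orthogonal matrices here, not adversarially learned), and the function names `assign_subspaces`, `piecewise_map`, and `nearest_target` are hypothetical. It shows only how several linear maps, one per subspace, can be applied to source embeddings and used for dictionary lookup.

```python
import numpy as np


def assign_subspaces(X, centroids):
    """Assign each source vector to its nearest centroid (assumed subspace rule)."""
    # Squared Euclidean distances, shape (n_words, n_subspaces).
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def piecewise_map(X, centroids, maps):
    """Map each source vector with the linear map of its own subspace."""
    labels = assign_subspaces(X, centroids)
    # maps[labels] has shape (n_words, d, d); einsum applies one matrix per word.
    mapped = np.einsum("nij,nj->ni", maps[labels], X)
    return mapped, labels


def nearest_target(mapped, Y):
    """Index of the most cosine-similar target vector for each mapped vector."""
    a = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    b = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return (a @ b.T).argmax(axis=1)


# Toy data: two well-separated subspaces in a 4-dimensional embedding space.
rng = np.random.default_rng(0)
d, k = 4, 2
centroids = np.array([[5.0, 0.0, 0.0, 0.0], [-5.0, 0.0, 0.0, 0.0]])

# Stand-in maps: one random orthogonal matrix per subspace (not learned).
maps = np.stack([np.linalg.qr(rng.normal(size=(d, d)))[0] for _ in range(k)])

# Three source words near each centroid.
true_labels = np.array([0, 0, 0, 1, 1, 1])
X = centroids[true_labels] + 0.1 * rng.normal(size=(6, d))

# Target space constructed so that source word i translates to target word i.
Y = np.stack([maps[l] @ x for l, x in zip(true_labels, X)])

mapped, labels = piecewise_map(X, centroids, maps)
pred = nearest_target(mapped, Y)
```

In this toy setup a single global matrix could not reproduce `Y` in general, because the two subspaces are rotated differently; applying one map per subspace recovers the intended word pairs.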
