Title
How Do Multilingual Encoders Learn Cross-lingual Representation?
Authors
Abstract
NLP systems typically need to support more than one language. Because different languages have different amounts of supervision, cross-lingual transfer benefits languages with little to no training data by transferring from other languages. From an engineering perspective, multilingual NLP simplifies development and maintenance by serving multiple languages with a single system. Both cross-lingual transfer and multilingual NLP rely on cross-lingual representations as their foundation. As BERT revolutionized representation learning and NLP, it also revolutionized cross-lingual representations and cross-lingual transfer. Multilingual BERT was released as a replacement for single-language BERT, trained on Wikipedia data in 104 languages. Surprisingly, without any explicit cross-lingual signal, multilingual BERT learns cross-lingual representations in addition to representations for individual languages. This thesis first demonstrates this surprising cross-lingual effectiveness against prior art on various tasks. Naturally, this raises a set of questions, most notably how these multilingual encoders learn cross-lingual representations. In exploring these questions, this thesis analyzes the behavior of multilingual models in a variety of settings on high- and low-resource languages. We also study how to inject different cross-lingual signals into multilingual encoders, as well as the optimization behavior of cross-lingual transfer with these models. Together, these analyses provide a better understanding of multilingual encoders and cross-lingual transfer. Our findings lead to suggested improvements to both multilingual encoders and cross-lingual transfer.
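To make the abstract's central observation concrete, below is a minimal sketch of one way to probe multilingual BERT's cross-lingual representations: encode a sentence and its translation with the publicly released bert-base-multilingual-cased checkpoint and compare mean-pooled embeddings. The sentence pair, pooling strategy, and similarity metric here are illustrative assumptions, not the experimental setup used in the thesis.

# A minimal sketch (Python, using the HuggingFace transformers library):
# encode an English sentence and a German translation with multilingual
# BERT and measure how similar their pooled representations are.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(sentence: str) -> torch.Tensor:
    # Mean-pool the final-layer hidden states into a single vector.
    # (Pooling choice is an assumption for illustration.)
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

en = embed("The cat sits on the mat.")
de = embed("Die Katze sitzt auf der Matte.")
print("cosine similarity:", torch.cosine_similarity(en, de, dim=0).item())

Probes of this kind tend to score translation pairs as more similar than unrelated sentences, despite the model never having seen parallel data; this emergent alignment is the phenomenon the thesis investigates.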