Paper Title

Multilingual Bidirectional Unsupervised Translation Through Multilingual Finetuning and Back-Translation

Paper Authors

Bryan Li, Mohammad Sadegh Rasooli, Ajay Patel, Chris Callison-Burch

Paper Abstract

We propose a two-stage approach for training a single NMT model to translate unseen languages both to and from English. For the first stage, we initialize an encoder-decoder model with pretrained XLM-R and RoBERTa weights, then perform multilingual fine-tuning on parallel data from 40 languages into English. We find this model can generalize to zero-shot translations on unseen languages. For the second stage, we leverage this generalization ability to generate synthetic parallel data from monolingual datasets, then bidirectionally train with successive rounds of back-translation. Our approach, which we call EcXTra (English-centric Crosslingual (X) Transfer), is conceptually simple, only using a standard cross-entropy objective throughout. It is also data-driven, sequentially leveraging auxiliary parallel data and monolingual data. We evaluate unsupervised NMT results for 7 low-resource languages, and find that each round of back-translation training further refines bidirectional performance. Our final single EcXTra-trained model achieves competitive translation performance in all translation directions, notably establishing a new state-of-the-art for English-to-Kazakh (22.9 > 10.4 BLEU). Our code is available at https://github.com/manestay/EcXTra.
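
To make the two-stage recipe concrete, below is a minimal Python sketch using Hugging Face transformers. It is an illustration under assumptions, not the authors' implementation (which lives in the linked repository): the base-sized checkpoints, the toy sentences, and the single gradient step stand in for the paper's full 40-language fine-tuning and multi-round back-translation.

    # Minimal sketch of the EcXTra-style two-stage recipe (assumptions noted inline).
    import torch
    from transformers import AutoTokenizer, EncoderDecoderModel

    # Stage 1: initialize an encoder-decoder model from pretrained weights
    # (multilingual XLM-R encoder, English RoBERTa decoder), as in the paper;
    # the base-sized checkpoints here are an assumption for brevity.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "xlm-roberta-base", "roberta-base"
    )
    src_tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
    tgt_tok = AutoTokenizer.from_pretrained("roberta-base")
    model.config.decoder_start_token_id = tgt_tok.cls_token_id
    model.config.pad_token_id = tgt_tok.pad_token_id

    # Multilingual fine-tuning uses only the standard cross-entropy objective;
    # one toy X->English parallel pair stands in for the 40-language data.
    src = src_tok("Бұл бір мысал сөйлем.", return_tensors="pt")  # toy Kazakh source
    tgt = tgt_tok("This is an example sentence.", return_tensors="pt")
    loss = model(
        input_ids=src.input_ids,
        attention_mask=src.attention_mask,
        labels=tgt.input_ids,
    ).loss  # cross-entropy over the decoder logits
    loss.backward()  # in practice: optimizer steps over the full parallel corpus

    # Stage 2 (one back-translation round, into-English direction shown):
    # translate monolingual text with the current model to build synthetic
    # parallel data, then continue training bidirectionally on it.
    with torch.no_grad():
        mono = src_tok("Тағы бір сөйлем.", return_tensors="pt")  # toy monolingual text
        synth_ids = model.generate(
            mono.input_ids,
            max_length=64,
            decoder_start_token_id=tgt_tok.cls_token_id,
        )
    synth_en = tgt_tok.batch_decode(synth_ids, skip_special_tokens=True)
    # (synth_en, original monolingual sentence) now forms a synthetic pair for
    # training the reverse (English-to-X) direction in the next round.
    print(synth_en)

The sketch covers only the into-English direction; in the paper, a single model is trained bidirectionally, with successive back-translation rounds refining both directions.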
