Paper Title

Multilingual Bidirectional Unsupervised Translation Through Multilingual Finetuning and Back-Translation

Paper Authors

Bryan Li, Mohammad Sadegh Rasooli, Ajay Patel, Chris Callison-Burch

Paper Abstract

We propose a two-stage approach for training a single NMT model to translate unseen languages both to and from English. For the first stage, we initialize an encoder-decoder model with pretrained XLM-R and RoBERTa weights, then perform multilingual fine-tuning on parallel data from 40 languages into English. We find this model can generalize to zero-shot translations on unseen languages. For the second stage, we leverage this generalization ability to generate synthetic parallel data from monolingual datasets, then bidirectionally train with successive rounds of back-translation. Our approach, which we call EcXTra (English-centric Crosslingual (X) Transfer), is conceptually simple, only using a standard cross-entropy objective throughout. It is also data-driven, sequentially leveraging auxiliary parallel data and monolingual data. We evaluate unsupervised NMT results for 7 low-resource languages, and find that each round of back-translation training further refines bidirectional performance. Our final single EcXTra-trained model achieves competitive translation performance in all translation directions, notably establishing a new state-of-the-art for English-to-Kazakh (22.9 > 10.4 BLEU). Our code is available at https://github.com/manestay/EcXTra.
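
To make the two-stage recipe concrete, below is a minimal Python sketch using Hugging Face transformers. It is an illustration under assumptions, not the authors' implementation (which lives in the linked repository): the base-sized checkpoints, the toy sentences, and the single gradient step stand in for the paper's full 40-language fine-tuning and multi-round back-translation.

    # Minimal sketch of the EcXTra-style two-stage recipe (assumptions noted inline).
    import torch
    from transformers import AutoTokenizer, EncoderDecoderModel

    # Stage 1: initialize an encoder-decoder model from pretrained weights
    # (multilingual XLM-R encoder, English RoBERTa decoder), as in the paper;
    # the base-sized checkpoints here are an assumption for brevity.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "xlm-roberta-base", "roberta-base"
    )
    src_tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
    tgt_tok = AutoTokenizer.from_pretrained("roberta-base")
    model.config.decoder_start_token_id = tgt_tok.cls_token_id
    model.config.pad_token_id = tgt_tok.pad_token_id

    # Multilingual fine-tuning uses only the standard cross-entropy objective;
    # one toy X->English parallel pair stands in for the 40-language data.
    src = src_tok("Бұл бір мысал сөйлем.", return_tensors="pt")  # toy Kazakh source
    tgt = tgt_tok("This is an example sentence.", return_tensors="pt")
    loss = model(
        input_ids=src.input_ids,
        attention_mask=src.attention_mask,
        labels=tgt.input_ids,
    ).loss  # cross-entropy over the decoder logits
    loss.backward()  # in practice: optimizer steps over the full parallel corpus

    # Stage 2 (one back-translation round, into-English direction shown):
    # translate monolingual text with the current model to build synthetic
    # parallel data, then continue training bidirectionally on it.
    with torch.no_grad():
        mono = src_tok("Тағы бір сөйлем.", return_tensors="pt")  # toy monolingual text
        synth_ids = model.generate(
            mono.input_ids,
            max_length=64,
            decoder_start_token_id=tgt_tok.cls_token_id,
        )
    synth_en = tgt_tok.batch_decode(synth_ids, skip_special_tokens=True)
    # (synth_en, original monolingual sentence) now forms a synthetic pair for
    # training the reverse (English-to-X) direction in the next round.
    print(synth_en)

The sketch covers only the into-English direction; in the paper, a single model is trained bidirectionally, with successive back-translation rounds refining both directions.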
