Paper Title
Finding Sparse Structures for Domain Specific Neural Machine Translation
Paper Authors
Paper Abstract
Neural machine translation often adopts the fine-tuning approach to adapt to specific domains. However, unrestricted fine-tuning can easily degrade performance on the general domain and over-fit to the target domain. To mitigate this issue, we propose Prune-Tune, a novel domain adaptation method via gradual pruning. It learns tiny domain-specific sub-networks during fine-tuning on new domains. Prune-Tune alleviates the over-fitting and degradation problems without modifying the model architecture. Furthermore, Prune-Tune is able to sequentially learn a single network with multiple disjoint domain-specific sub-networks for multiple domains. Empirical results show that Prune-Tune outperforms several strong competitors on the target domain test set without sacrificing quality on the general domain, in both single- and multi-domain settings. The source code and data are available at https://github.com/ohlionel/Prune-Tune.
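The core mechanism the abstract describes, freeing low-magnitude weights via pruning and then fine-tuning only those freed slots on the new domain, can be sketched in a few lines. The following is a minimal numpy illustration of that idea, not the authors' implementation; the function names and the magnitude-pruning criterion are assumptions for exposition.

```python
import numpy as np

def magnitude_prune_mask(weights, sparsity):
    """Keep the largest-magnitude (1 - sparsity) fraction of weights.

    Returns a boolean mask: True marks weights retained for the general
    domain; False marks slots freed for a new domain's sub-network.
    (Illustrative magnitude criterion; an assumption for this sketch.)
    """
    k = int(weights.size * sparsity)  # number of slots to free
    if k == 0:
        return np.ones(weights.shape, dtype=bool)
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.abs(weights) > threshold

def domain_tune_step(weights, grad, general_mask, lr=0.1):
    """One fine-tuning step that updates only the freed slots,
    leaving the general-domain weights untouched."""
    return weights - lr * grad * (~general_mask)

# Toy example: free 50% of the weights, then take one domain step.
w = np.array([[0.5, -0.01], [0.02, -0.8]])
mask = magnitude_prune_mask(w, sparsity=0.5)
w_new = domain_tune_step(w, grad=np.ones_like(w), general_mask=mask)
```

Because the update is gated by `~general_mask`, general-domain weights are bit-identical after domain tuning, which is why the method avoids degradation on the general domain; disjoint masks per domain give the multi-domain variant.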