Title
Effective Pre-Training Objectives for Transformer-based Autoencoders
Authors
Abstract
In this paper, we study the trade-offs between efficiency, cost, and accuracy when pre-training Transformer encoders with different pre-training objectives. For this purpose, we analyze the features of common objectives and combine them to create new, effective pre-training approaches. Specifically, we design light token generators based on a straightforward statistical approach, which can replace ELECTRA's computationally heavy generators, thus greatly reducing cost. Our experiments also show that (i) there are more efficient alternatives to BERT's MLM, and (ii) it is possible to efficiently pre-train Transformer-based models using lighter generators without a significant drop in performance.
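To make the idea of a "light token generator" concrete, the sketch below shows one way a simple statistical generator could work: replacement tokens are sampled from the corpus unigram frequency distribution instead of being produced by a trained generator network, as in ELECTRA-style replaced-token detection. This is an illustrative assumption about the approach, not the paper's exact method; all function names here are hypothetical.

```python
import random
from collections import Counter


def build_unigram_generator(corpus_tokens, seed=0):
    """Build a frequency-based token sampler from a flat token list.

    Sampling replacements from the unigram distribution is one simple
    statistical alternative to ELECTRA's learned generator network
    (an assumed illustration, not necessarily the paper's method).
    """
    counts = Counter(corpus_tokens)
    vocab = list(counts)
    weights = [counts[t] for t in vocab]
    rng = random.Random(seed)

    def sample(n=1):
        # Draw n tokens with probability proportional to corpus frequency.
        return rng.choices(vocab, weights=weights, k=n)

    return sample


def corrupt(sentence_tokens, sample, replace_prob=0.15, seed=0):
    """Replace a fraction of tokens with generator samples.

    Returns the corrupted sequence and binary labels (1 = token was
    replaced), the supervision signal for replaced-token detection.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in sentence_tokens:
        if rng.random() < replace_prob:
            new = sample(1)[0]
            corrupted.append(new)
            labels.append(int(new != tok))
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels
```

Because the generator is just a weighted table lookup, it adds essentially no compute compared to running a second Transformer as the generator, which is the cost saving the abstract refers to.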