Paper Title

Dialogue-adaptive Language Model Pre-training From Quality Estimation

Authors

Junlong Li, Zhuosheng Zhang, Hai Zhao

Abstract

Pre-trained language models (PrLMs) have achieved great success on a wide range of natural language processing tasks by virtue of the universal language representation ability obtained through self-supervised learning on large corpora. These models are pre-trained on standard plain text with general language model (LM) training objectives, which is insufficient for modeling dialogue-exclusive attributes such as specificity and informativeness, since such attributes are not explicitly captured by pre-trained universal language representations. In this work, we propose dialogue-adaptive pre-training objectives (DAPO) derived from quality estimation to model dialogue-specific features, namely coherence, specificity, and informativeness. As the foundation for model pre-training, we synthesize a new dialogue corpus and build our training set with two unsupervised methods: 1) coherence-oriented context corruption, including utterance ordering, insertion, and replacement, which helps the model capture coherence within dialogue contexts; and 2) specificity-oriented automatic rescoring, which encourages the model to measure the quality of the synthesized data for dialogue-adaptive pre-training by considering specificity and informativeness. Experimental results on widely used open-domain response selection and quality estimation benchmarks show that DAPO significantly improves over baseline models and achieves state-of-the-art performance on the MuTual leaderboard, verifying the effectiveness of incorporating quality estimation factors into pre-training.
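To make the two data-synthesis steps concrete, below is a minimal Python sketch of coherence-oriented context corruption (utterance ordering, insertion, and replacement) together with a toy specificity proxy. All function names and the IDF-based specificity heuristic are illustrative assumptions for this sketch, not the paper's released code or its actual rescoring model.

```python
import math
import random
from collections import Counter

# Illustrative sketch only: function names and the specificity heuristic
# are assumptions; the paper's actual implementation is not reproduced here.

def corrupt_by_ordering(utterances, rng=random):
    """Utterance ordering: shuffle the dialogue context to break its coherence."""
    corrupted = utterances[:]
    rng.shuffle(corrupted)
    return corrupted

def corrupt_by_insertion(utterances, distractor, rng=random):
    """Utterance insertion: splice an utterance from another dialogue into the context."""
    pos = rng.randint(0, len(utterances))
    return utterances[:pos] + [distractor] + utterances[pos:]

def corrupt_by_replacement(utterances, distractor, rng=random):
    """Utterance replacement: swap one utterance for a foreign one."""
    pos = rng.randrange(len(utterances))
    return utterances[:pos] + [distractor] + utterances[pos + 1:]

def specificity_score(utterance, doc_freq, num_dialogues):
    """Toy specificity proxy: mean inverse document frequency of the tokens,
    so rare content words score higher than generic fillers like "ok" or "yes".
    This heuristic is an assumption, standing in for the paper's rescoring model."""
    tokens = utterance.lower().split()
    if not tokens:
        return 0.0
    idf = [math.log(num_dialogues / (1 + doc_freq.get(t, 0))) for t in tokens]
    return sum(idf) / len(idf)

if __name__ == "__main__":
    dialogue = [
        "A: Do you know when the library closes today?",
        "B: It closes at 9 pm on weekdays.",
        "A: Great, I'll drop by after dinner.",
    ]
    distractor = "B: The forecast says heavy rain tomorrow."

    print(corrupt_by_ordering(dialogue))
    print(corrupt_by_insertion(dialogue, distractor))
    print(corrupt_by_replacement(dialogue, distractor))

    # Document frequencies over a (tiny) corpus of dialogues for the IDF proxy.
    corpus = [dialogue, [distractor]]
    doc_freq = Counter(t for d in corpus for t in set(" ".join(d).lower().split()))
    print(specificity_score("It closes at 9 pm on weekdays.", doc_freq, len(corpus)))
```

In DAPO-style pre-training, contexts corrupted this way would presumably serve as low-coherence negatives against the original dialogues, while a rescoring step along the lines of the specificity proxy filters or weights the synthesized examples by quality.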
