Paper Title

Dialogue-adaptive Language Model Pre-training From Quality Estimation

Authors

Junlong Li, Zhuosheng Zhang, Hai Zhao

Abstract

Pre-trained language models (PrLMs) have achieved great success on a wide range of natural language processing tasks by virtue of the universal language representation ability obtained through self-supervised learning on large corpora. These models are pre-trained on standard plain text with general language model (LM) training objectives, which is insufficient for modeling dialogue-exclusive attributes such as specificity and informativeness, since such attributes are not explicitly captured by pre-trained universal language representations. In this work, we propose dialogue-adaptive pre-training objectives (DAPO) derived from quality estimation to model dialogue-specific features, namely coherence, specificity, and informativeness. As the foundation for model pre-training, we synthesize a new dialogue corpus and build our training set with two unsupervised methods: 1) coherence-oriented context corruption, including utterance ordering, insertion, and replacement, which helps the model capture coherence within dialogue contexts; and 2) specificity-oriented automatic rescoring, which encourages the model to measure the quality of the synthesized data for dialogue-adaptive pre-training by considering specificity and informativeness. Experimental results on widely used open-domain response selection and quality estimation benchmarks show that DAPO significantly improves over baseline models and achieves state-of-the-art performance on the MuTual leaderboard, verifying the effectiveness of incorporating quality estimation factors into pre-training.
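To make the two data-synthesis steps concrete, below is a minimal Python sketch of coherence-oriented context corruption (utterance ordering, insertion, and replacement) together with a toy specificity proxy. All function names and the IDF-based specificity heuristic are illustrative assumptions for this sketch, not the paper's released code or its actual rescoring model.

```python
import math
import random
from collections import Counter

# Illustrative sketch only: function names and the specificity heuristic
# are assumptions; the paper's actual implementation is not reproduced here.

def corrupt_by_ordering(utterances, rng=random):
    """Utterance ordering: shuffle the dialogue context to break its coherence."""
    corrupted = utterances[:]
    rng.shuffle(corrupted)
    return corrupted

def corrupt_by_insertion(utterances, distractor, rng=random):
    """Utterance insertion: splice an utterance from another dialogue into the context."""
    pos = rng.randint(0, len(utterances))
    return utterances[:pos] + [distractor] + utterances[pos:]

def corrupt_by_replacement(utterances, distractor, rng=random):
    """Utterance replacement: swap one utterance for a foreign one."""
    pos = rng.randrange(len(utterances))
    return utterances[:pos] + [distractor] + utterances[pos + 1:]

def specificity_score(utterance, doc_freq, num_dialogues):
    """Toy specificity proxy: mean inverse document frequency of the tokens,
    so rare content words score higher than generic fillers like "ok" or "yes".
    This heuristic is an assumption, standing in for the paper's rescoring model."""
    tokens = utterance.lower().split()
    if not tokens:
        return 0.0
    idf = [math.log(num_dialogues / (1 + doc_freq.get(t, 0))) for t in tokens]
    return sum(idf) / len(idf)

if __name__ == "__main__":
    dialogue = [
        "A: Do you know when the library closes today?",
        "B: It closes at 9 pm on weekdays.",
        "A: Great, I'll drop by after dinner.",
    ]
    distractor = "B: The forecast says heavy rain tomorrow."

    print(corrupt_by_ordering(dialogue))
    print(corrupt_by_insertion(dialogue, distractor))
    print(corrupt_by_replacement(dialogue, distractor))

    # Document frequencies over a (tiny) corpus of dialogues for the IDF proxy.
    corpus = [dialogue, [distractor]]
    doc_freq = Counter(t for d in corpus for t in set(" ".join(d).lower().split()))
    print(specificity_score("It closes at 9 pm on weekdays.", doc_freq, len(corpus)))
```

In DAPO-style pre-training, contexts corrupted this way would presumably serve as low-coherence negatives against the original dialogues, while a rescoring step along the lines of the specificity proxy filters or weights the synthesized examples by quality.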
