Paper Title

Match-Prompt: Improving Multi-task Generalization Ability for Neural Text Matching via Prompt Learning

Paper Authors

Shicheng Xu, Liang Pang, Huawei Shen, Xueqi Cheng

Paper Abstract

Text matching is a fundamental technique in both information retrieval and natural language processing. Text matching tasks share the same paradigm of determining the relationship between two given texts. The relationships vary from task to task, e.g., relevance in document retrieval, semantic alignment in paraphrase identification, and answerability judgment in question answering. However, the essential signals for text matching remain within a finite scope, i.e., exact matching, semantic matching, and inference matching. Ideally, a good text matching model can learn to capture and aggregate these signals across different matching tasks to achieve competitive performance, whereas recent state-of-the-art text matching models, e.g., Pre-trained Language Models (PLMs), struggle to generalize. This is because end-to-end supervised learning on a task-specific dataset makes the model overemphasize the data sample bias and task-specific signals instead of the essential matching signals. To overcome this problem, we adopt a specialization-generalization training strategy and refer to it as Match-Prompt. In the specialization stage, descriptions of different matching tasks are mapped to a few prompt tokens. In the generalization stage, the matching model explores the essential matching signals by being trained on diverse matching tasks. Highly diverse matching tasks keep the model from fitting the data bias of any specific task, so that it can focus on learning the essential matching signals. Meanwhile, the prompt tokens obtained in the first stage help the model distinguish different task-specific matching signals. Experimental results on public datasets show that Match-Prompt can improve the multi-task generalization capability of PLMs in text matching, yielding better in-domain multi-task, out-of-domain multi-task, and new-task adaptation performance than multi-task and task-specific models trained with the previous fine-tuning paradigm.
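To make the two-stage idea concrete, here is a minimal PyTorch sketch of the training scheme the abstract describes. It is not the authors' implementation: the toy Transformer encoder (standing in for a PLM), the random initialization of prompt tokens (the paper maps actual task descriptions to them), the task names, the dimensions, and the training loop are all assumptions made purely for illustration.

```python
# Minimal sketch of the Match-Prompt training scheme described in the abstract.
# NOT the authors' code: the toy encoder, sizes, task names, and data are assumed.
import torch
import torch.nn as nn

VOCAB, DIM, N_PROMPT = 1000, 64, 4   # toy sizes, chosen for illustration
TASKS = ["document_retrieval", "paraphrase_identification", "qa_answerability"]

class MatchPromptSketch(nn.Module):
    """Shared matching encoder plus a small bank of prompt tokens per task."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)  # stand-in for a PLM's token embeddings
        layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Specialization stage (simplified): each task's description is distilled
        # into a few learnable prompt embeddings; here they are free parameters.
        self.prompts = nn.ParameterDict(
            {t: nn.Parameter(torch.randn(N_PROMPT, DIM) * 0.02) for t in TASKS}
        )
        self.cls = nn.Linear(DIM, 2)  # binary match / no-match head

    def forward(self, task: str, pair_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(pair_ids)                                   # (B, L, DIM)
        p = self.prompts[task].unsqueeze(0).expand(x.size(0), -1, -1)
        h = self.encoder(torch.cat([p, x], dim=1))                 # prepend prompts
        return self.cls(h[:, 0])                                   # read first prompt slot

model = MatchPromptSketch()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Generalization stage: interleave batches from diverse matching tasks so the
# shared encoder is pushed toward the essential matching signals, while the
# per-task prompt tokens absorb the task-specific ones.
for step in range(3):  # toy loop on random data
    task = TASKS[step % len(TASKS)]
    pair_ids = torch.randint(0, VOCAB, (8, 32))   # fake tokenized text pairs
    labels = torch.randint(0, 2, (8,))            # fake match labels
    loss = loss_fn(model(task, pair_ids), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"{task}: loss={loss.item():.3f}")
```

Under this setup, adapting to a new matching task could amount to learning a fresh prompt bank against the frozen shared encoder, which is consistent with the new-task adaptation behavior the abstract reports, though the paper's exact adaptation procedure is not spelled out here.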
