Paper Title
Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization
Paper Authors
Paper Abstract
Neural abstractive summarization models are prone to generate summaries which are factually inconsistent with their source documents. Previous work has introduced the task of recognizing such factual inconsistency as a downstream application of natural language inference (NLI). However, state-of-the-art NLI models perform poorly in this context due to their inability to generalize to the target task. In this work, we show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples. We introduce Falsesum, a data generation pipeline leveraging a controllable text generation model to perturb human-annotated summaries, introducing varying types of factual inconsistencies. Unlike previously introduced document-level NLI datasets, our generated dataset contains examples that are diverse and inconsistent yet plausible. We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization. The code to obtain the dataset is available online at https://github.com/joshbambrick/Falsesum.
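To make the task setup concrete, the sketch below illustrates how a document-summary pair can be scored with an NLI model, treating the document as the premise and the summary as the hypothesis. It is a minimal illustration assuming the Hugging Face `transformers` library and the publicly available `roberta-large-mnli` checkpoint as a stand-in; it does not use the paper's Falsesum-augmented training data or released models.

```python
# Minimal sketch: factual-consistency check as document-level NLI.
# Assumes `transformers` and the public `roberta-large-mnli` checkpoint
# (a stand-in NLI model, NOT a Falsesum-augmented one).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Hypothetical example pair: the summary contradicts the source document.
document = "The city council approved the new budget on Tuesday after a lengthy debate."
summary = "The council rejected the budget proposal."

# The source document serves as the NLI premise, the summary as the hypothesis.
inputs = tokenizer(document, summary, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze()

# Label order for roberta-large-mnli: 0 = contradiction, 1 = neutral, 2 = entailment.
# A low entailment probability flags the summary as potentially inconsistent.
print({
    "contradiction": probs[0].item(),
    "neutral": probs[1].item(),
    "entailment": probs[2].item(),
})
```

In this framing, the paper's contribution is on the data side: augmenting NLI training sets with Falsesum-generated inconsistent summaries so that a model fine-tuned this way generalizes better to document-level premises than one trained on sentence-level NLI data alone.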