Title
Partial-input baselines show that NLI models can ignore context, but they don't
Authors
Abstract
When strong partial-input baselines reveal artifacts in crowdsourced NLI datasets, the performance of full-input models trained on such datasets is often dismissed as reliance on spurious correlations. We investigate whether state-of-the-art NLI models are capable of overriding default inferences made by a partial-input baseline. We introduce an evaluation set of 600 examples consisting of perturbed premises to examine a RoBERTa model's sensitivity to edited contexts. Our results indicate that NLI models are still capable of learning to condition on context, a necessary component of inferential reasoning, despite being trained on artifact-ridden datasets.
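To make the contrast concrete, here is a minimal sketch, not the paper's actual evaluation code, of probing a full-input NLI model with an original versus an edited premise. It uses the public roberta-large-mnli checkpoint as a stand-in for the trained model, and the example sentences are hypothetical illustrations, not items from the paper's 600-example evaluation set.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Public full-input NLI model used here purely for illustration.
model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def predict(premise: str, hypothesis: str) -> str:
    """Return the model's NLI label for a (premise, hypothesis) pair."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[logits.argmax(dim=-1).item()]

hypothesis = "A man is sleeping."

# Original premise supports the hypothesis; the edited premise flips
# the relation. A model that genuinely conditions on context should
# change its label, whereas a hypothesis-only (partial-input) baseline
# is blind to the edit by construction.
print(predict("A man naps on the couch.", hypothesis))       # expected: ENTAILMENT
print(predict("A man does yard work outside.", hypothesis))  # expected: CONTRADICTION

# One crude way to simulate a partial-input baseline with the same
# model: withhold the premise entirely and score the hypothesis alone.
print(predict("", hypothesis))
```

If the full-input model's prediction tracks the premise edit while the premise-blind variant stays fixed, that is the kind of context-sensitivity the abstract argues artifact-ridden training data does not preclude.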