Paper Title
Generating Natural Language Proofs with Verifier-Guided Search
Paper Authors
Paper Abstract
Reasoning over natural language is a challenging problem in NLP. In this work, we focus on proof generation: given a hypothesis and a set of supporting facts, the model generates a proof tree indicating how to derive the hypothesis from the supporting facts. Compared to generating the entire proof in one shot, stepwise generation can better exploit compositionality and generalize to longer proofs, but has achieved limited success on real-world data. Existing stepwise methods struggle to generate proof steps that are both logically valid and relevant to the hypothesis. Instead, they tend to hallucinate invalid steps given the hypothesis. In this paper, we present a novel stepwise method, NLProofS (Natural Language Proof Search), which learns to generate relevant steps conditioned on the hypothesis. At the core of our approach, we train an independent verifier to check the validity of the proof steps to prevent hallucination. Instead of generating steps greedily, we search for proofs maximizing a global proof score judged by the verifier. NLProofS achieves state-of-the-art performance on EntailmentBank and RuleTaker. Specifically, it improves the correctness of predicted proofs from 27.7% to 33.3% in the distractor setting of EntailmentBank, demonstrating the effectiveness of NLProofS in generating challenging human-authored proofs.
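To make the search procedure concrete, the following is a minimal Python sketch of verifier-guided best-first proof search in the spirit of the abstract. It is not the paper's implementation: the `generate_steps` and `verify_step` interfaces, the min-over-steps aggregation of the global proof score, the string-equality goal test, and the expansion budget are all illustrative assumptions.

```python
import heapq
from typing import Callable, List, Tuple

Step = Tuple[Tuple[str, ...], str]  # (premises, conclusion)

def proof_search(
    hypothesis: str,
    facts: List[str],
    generate_steps: Callable[[List[str], str], List[Step]],   # hypothetical prover model
    verify_step: Callable[[Tuple[str, ...], str], float],      # hypothetical verifier model
    max_expansions: int = 100,
) -> List[Step]:
    """Best-first search for a proof deriving `hypothesis` from `facts`.

    A partial proof is scored by the minimum verifier score over its steps
    (one plausible way to aggregate per-step scores into a global proof
    score); the highest-scoring partial proof is expanded first.
    """
    tie = 0         # unique tiebreaker so the heap never compares list entries
    expansions = 0
    # Heap entries: (negated score for max-heap behavior, tiebreaker,
    #                known facts, steps taken so far)
    queue = [(-1.0, tie, list(facts), [])]
    while queue and expansions < max_expansions:
        neg_score, _, known, steps = heapq.heappop(queue)
        expansions += 1
        for premises, conclusion in generate_steps(known, hypothesis):
            step_score = verify_step(premises, conclusion)      # validity in [0, 1]
            new_score = min(-neg_score, step_score)             # global proof score
            new_steps = steps + [(premises, conclusion)]
            if conclusion == hypothesis:                        # hypothesis derived
                return new_steps
            tie += 1
            heapq.heappush(queue, (-new_score, tie, known + [conclusion], new_steps))
    return []  # no proof found within the budget
```

The min aggregation reflects the intuition behind an independent verifier: a proof is only as strong as its weakest step, so a single hallucinated (low-validity) step drags the whole proof's score down and steers the search toward alternatives, rather than letting a greedy generator commit to it.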