Paper Title
Transformers as Soft Reasoners over Language

Authors

Peter Clark, Oyvind Tafjord, Kyle Richardson

Abstract
Beginning with McCarthy's Advice Taker (1959), AI has pursued the goal of providing a system with explicit, general knowledge and having the system reason over that knowledge. However, expressing the knowledge in a formal (logical or probabilistic) representation has been a major obstacle to this research. This paper investigates a modern approach to this problem where the facts and rules are provided as natural language sentences, thus bypassing a formal representation. We train transformers to reason (or emulate reasoning) over these sentences using synthetically generated data. Our models, which we call RuleTakers, provide the first empirical demonstration that this kind of soft reasoning over language is learnable, can achieve high (99%) accuracy, and generalizes to test data requiring substantially deeper chaining than seen during training (95%+ scores). We also demonstrate that the models transfer well to two hand-authored rulebases, and to rulebases paraphrased into more natural language. These findings are significant as they suggest a new role for transformers, namely as limited "soft theorem provers" operating over explicit theories in language. This in turn suggests new possibilities for explainability, correctability, and counterfactual reasoning in question-answering.
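To illustrate the deductive chaining that RuleTakers learn to emulate, here is a minimal sketch of classical forward chaining over a toy rulebase. The facts, rules, and sentences below are hypothetical examples, not from the paper; a RuleTaker would instead read such statements as plain English and predict the answer directly, without an explicit inference engine.

```python
# A minimal sketch (hypothetical example) of the kind of multi-step
# deduction that RuleTakers emulate over natural-language sentences.

def forward_chain(facts, rules):
    """Apply rules to facts until no new fact can be derived (fixpoint)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived

# Facts and rules written as simple atomic sentences.
facts = {"erin is young", "erin is kind"}
rules = [
    ({"erin is young", "erin is kind"}, "erin is nice"),  # depth-1 rule
    ({"erin is nice"}, "erin is big"),                    # depth-2 rule
]

closure = forward_chain(facts, rules)
print("erin is big" in closure)  # two-step chaining -> True
```

The paper's generalization result concerns exactly this chaining depth: models trained on shallow derivations still answer questions whose proofs require deeper chains than any seen in training.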