Title
Critical Thinking for Language Models
Authors
Abstract
This paper takes a first step towards a critical thinking curriculum for neural auto-regressive language models. We introduce a synthetic corpus of deductively valid arguments, and generate artificial argumentative texts to train and evaluate GPT-2. Significant transfer learning effects can be observed: training a model on three simple core schemes allows it to accurately complete conclusions of different and more complex types of arguments, too. The language models generalize the core argument schemes in a correct way. Moreover, we obtain consistent and promising results on NLU benchmarks. In particular, pre-training on the argument schemes raises zero-shot accuracy on the GLUE diagnostics by up to 15 percentage points. The findings suggest that intermediary pre-training on texts that exemplify basic reasoning abilities (such as those typically covered in critical thinking textbooks) might help language models acquire a broad range of reasoning skills. The synthetic argumentative texts presented in this paper are a promising starting point for building such a "critical thinking curriculum for language models."
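To make the corpus-construction idea concrete, the following is a minimal, hypothetical sketch of how synthetic texts for one deductively valid scheme (generalized modus ponens) could be generated from templates. The template wording, predicate lists, and names below are invented for illustration and are not taken from the paper's actual corpus:

```python
import random

# Hypothetical sketch: instantiate a deductively valid argument scheme
# (generalized modus ponens) with randomly sampled predicates and names.
# All templates and vocabulary here are illustrative assumptions, not the
# paper's actual generation pipeline.

SCHEME = (
    "If someone is a {F}, then they are a {G}. "
    "{a} is a {F}. "
    "Therefore, {a} is a {G}."
)

PREDICATES = ["philosopher", "logician", "scientist", "critical thinker"]
NAMES = ["Alice", "Bob", "Chen"]

def generate_argument(rng: random.Random) -> str:
    """Fill the scheme with two distinct predicates and a name."""
    f, g = rng.sample(PREDICATES, 2)  # antecedent and consequent predicates
    a = rng.choice(NAMES)             # the individual the argument is about
    return SCHEME.format(F=f, G=g, a=a)

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(generate_argument(rng))
```

A corpus built this way consists of premise–conclusion texts whose validity is guaranteed by construction, so a model's ability to complete the conclusion can be scored automatically; more complex schemes would add further premises or nest the templates.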