A Logic for Expressing Log-Precision Transformers

Paper Title

A Logic for Expressing Log-Precision Transformers

Authors

William Merrill, Ashish Sabharwal

Abstract

One way to interpret the reasoning power of transformer-based language models is to describe the types of logical rules they can resolve over some input text. Recently, Chiang et al. (2023) showed that finite-precision transformers can be equivalently expressed in a generalization of first-order logic. However, finite-precision transformers are a weak transformer variant because, as we show, a single head can only attend to a constant number of tokens and, in particular, cannot represent uniform attention. Since attending broadly is a core capability for transformers, we ask whether a minimally more expressive model that can attend universally can also be characterized in logic. To this end, we analyze transformers whose forward pass is computed in $\log n$ precision on contexts of length $n$. We prove that any log-precision transformer can be equivalently expressed as a first-order logic sentence that, in addition to standard universal and existential quantifiers, may also contain majority-vote quantifiers. This is the tightest known upper bound and first logical characterization of log-precision transformers.
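The abstract's point that constant precision cannot represent uniform attention, while $\log n$ precision can, may be illustrated numerically. The sketch below is a toy fixed-point model and not the formal model analyzed in the paper: `quantize`, `FIXED_BITS`, and `log_bits` are hypothetical names, and the rounding scheme and the extra bit in `log_bits` are assumptions chosen only for illustration. Uniform attention over $n$ tokens gives each token weight $1/n$; with a fixed number of fractional bits this weight rounds to zero once $n$ is large enough, whereas with roughly $\log_2 n$ bits it stays nonzero.

```python
import math

def quantize(x: float, frac_bits: int) -> float:
    """Round x to the nearest multiple of 2**-frac_bits (toy fixed-point model)."""
    scale = 2 ** frac_bits
    return round(x * scale) / scale

def uniform_weight(n: int, frac_bits: int) -> float:
    """Per-token weight of uniform attention over n tokens, after rounding."""
    return quantize(1.0 / n, frac_bits)

# Hypothetical constant precision that does not grow with the context length.
FIXED_BITS = 8

def log_bits(n: int) -> int:
    """Precision that grows with log n; the extra bit is an illustrative choice."""
    return math.ceil(math.log2(n)) + 1

for n in (16, 256, 4096):
    fixed = uniform_weight(n, FIXED_BITS)
    logp = uniform_weight(n, log_bits(n))
    print(f"n={n}: fixed-precision weight={fixed}, log-precision weight={logp}")
```

Under these assumptions, the fixed-precision weight collapses to zero at $n = 4096$, while the log-precision weight remains exactly $1/4096$.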
