Paper Title

When Does Syntax Mediate Neural Language Model Performance? Evidence from Dropout Probes

Authors

Mycal Tucker, Tiwalayo Eisape, Peng Qian, Roger Levy, Julie Shah

Abstract

Recent causal probing literature reveals when language models and syntactic probes use similar representations. Such techniques may yield "false negative" causality results: models may use representations of syntax, but probes may have learned to use redundant encodings of the same syntactic information. We demonstrate that models do encode syntactic information redundantly and introduce a new probe design that guides probes to consider all syntactic information present in embeddings. Using these probes, we find evidence for the use of syntax in models where prior methods did not, allowing us to boost model performance by injecting syntactic information into representations.
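The abstract's core idea, a probe trained so it cannot rely on any single redundant encoding of syntax, can be illustrated with a minimal sketch. The code below is an illustrative assumption, not the paper's implementation: it trains a logistic-regression probe with inverted dropout applied to the input embedding, on toy data where the "syntactic" signal is redundantly encoded in two dimensions. All names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_probe_step(W, b, x, y, p=0.5, lr=0.1):
    """One SGD step of a logistic-regression probe with input dropout.

    Randomly zeroing input dimensions (inverted dropout) prevents the
    probe from leaning on only one copy of a redundantly encoded signal.
    """
    mask = (rng.random(x.shape) >= p) / (1.0 - p)  # inverted-dropout mask
    h = x * mask
    z = h @ W + b
    pred = 1.0 / (1.0 + np.exp(-z))  # sigmoid
    grad = pred - y                  # d(log-loss)/dz
    W -= lr * grad * h
    b -= lr * grad
    return W, b

# Toy embeddings: the binary "syntactic" label is redundantly encoded
# in dimensions 0 and 1 (dimension 1 is an exact copy of dimension 0).
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(float)
X[:, 1] = X[:, 0]

W, b = np.zeros(4), 0.0
for _ in range(20):
    for xi, yi in zip(X, y):
        W, b = dropout_probe_step(W, b, xi, yi)

# With dropout, the probe assigns weight to BOTH redundant dimensions,
# rather than collapsing onto a single copy of the signal.
print(W)
```

A probe trained this way attends to every dimension carrying the signal, which is the property the abstract argues is needed to avoid "false negative" causality results when intervening on one encoding.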
