Black-box language model explanation by context length probing

Paper title

Black-box language model explanation by context length probing

Authors

Ondřej Cífka, Antoine Liutkus

Abstract

The increasingly widespread adoption of large language models has highlighted the need for improving their explainability. We present context length probing, a novel explanation technique for causal language models, based on tracking the predictions of a model as a function of the length of available context, and allowing to assign differential importance scores to different contexts. The technique is model-agnostic and does not rely on access to model internals beyond computing token-level probabilities. We apply context length probing to large pre-trained language models and offer some initial analyses and insights, including the potential for studying long-range dependencies. The source code and an interactive demo of the method are available.
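
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' released code; it is one plausible reading of the description, in which the score for a context token is the change in the target's log-probability when the context is extended to include that token. The names `context_length_scores` and `toy_log_prob` are hypothetical, and the toy model stands in for a real causal language model that exposes only token-level probabilities.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical black-box interface: returns log p(target | context).
LogProbFn = Callable[[Sequence[str], str], float]


def context_length_scores(
    tokens: Sequence[str],
    target_index: int,
    log_prob: LogProbFn,
    max_context: int,
) -> List[float]:
    """Differential scores for the tokens preceding tokens[target_index].

    scores[k] is the change in the target's log-probability when the
    context grows from k to k + 1 tokens, i.e. when the token k + 1
    positions before the target becomes visible to the model.
    """
    target = tokens[target_index]
    longest = min(max_context, target_index)
    # Target log-probability as a function of context length 0..longest.
    curve = [
        log_prob(tokens[target_index - c:target_index], target)
        for c in range(longest + 1)
    ]
    return [curve[c] - curve[c - 1] for c in range(1, longest + 1)]


def toy_log_prob(context: Sequence[str], target: str) -> float:
    # Toy stand-in for a real model (an assumption for illustration only):
    # the target is likely only if it already occurs in the visible context.
    return math.log(0.9 if target in context else 0.1)


tokens = ["the", "cat", "sat", "near", "the", "cat"]
scores = context_length_scores(tokens, 5, toy_log_prob, max_context=5)
most_important = max(range(len(scores)), key=lambda k: scores[k])
```

In this toy run the only nonzero score belongs to the token four positions before the target (the earlier "cat"), because that is the context length at which the prediction changes. With a real model, `log_prob` would wrap whatever call returns token-level probabilities, and no other access to model internals is needed.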
