Paper Title

Transformer Memory as a Differentiable Search Index

Paper Authors

Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler

Paper Abstract

In this paper, we demonstrate that information retrieval can be accomplished with a single Transformer, in which all information about the corpus is encoded in the parameters of the model. To this end, we introduce the Differentiable Search Index (DSI), a new paradigm that learns a text-to-text model that maps string queries directly to relevant docids; in other words, a DSI model answers queries directly using only its parameters, dramatically simplifying the whole retrieval process. We study variations in how documents and their identifiers are represented, variations in training procedures, and the interplay between models and corpus sizes. Experiments demonstrate that given appropriate design choices, DSI significantly outperforms strong baselines such as dual encoder models. Moreover, DSI demonstrates strong generalization capabilities, outperforming a BM25 baseline in a zero-shot setup.
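
To make the retrieval-as-generation idea concrete, below is a minimal sketch of DSI-style training and inference using a T5 model from the Hugging Face transformers library. The task prefixes, toy corpus, atomic docid strings, and hyperparameters are illustrative assumptions rather than the paper's exact configuration; the paper also studies structured docids and other indexing strategies, which this sketch omits.

```python
# Minimal DSI-style sketch: one seq2seq model is trained on two tasks,
# indexing (document text -> docid) and retrieval (query -> docid).
# All names, prefixes, and hyperparameters here are illustrative.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Toy corpus: each document is assigned an arbitrary string identifier.
corpus = {
    "doc_17": "Transformers encode corpus information in their parameters.",
    "doc_42": "Dual encoders retrieve documents via nearest-neighbor search.",
}
# Example (query, docid) pairs for the retrieval task.
queries = [("how do transformers store corpus information?", "doc_17")]

# Multi-task training data: both tasks are plain text-to-text examples
# whose target sequence is the docid string itself.
examples = [(f"document: {text}", docid) for docid, text in corpus.items()]
examples += [(f"query: {q}", docid) for q, docid in queries]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for epoch in range(10):
    for source, target in examples:
        inputs = tokenizer(source, return_tensors="pt", truncation=True)
        labels = tokenizer(target, return_tensors="pt").input_ids
        loss = model(**inputs, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Inference: retrieval is just decoding. Beam search yields a ranked
# list of candidate docids; a full system would constrain decoding to
# the set of valid identifiers, which this sketch does not.
model.eval()
query = tokenizer("query: how do transformers store corpus information?",
                  return_tensors="pt")
outputs = model.generate(**query, num_beams=4, num_return_sequences=4,
                         max_new_tokens=8)
for ids in outputs:
    print(tokenizer.decode(ids, skip_special_tokens=True))
```

Note how the model's parameters play the role of the index: there is no external inverted index or embedding table at inference time, and the ranked list comes directly from the beam-search scores over docid strings.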
