Paper Title

Compact Token Representations with Contextual Quantization for Efficient Document Re-ranking

Authors

Yingrui Yang, Yifan Qiao, Tao Yang

Abstract

Transformer-based re-ranking models can achieve high search relevance through context-aware soft matching of query tokens with document tokens. To alleviate the runtime complexity of such inference, previous work has adopted a late-interaction architecture with pre-computed contextual token representations, at the cost of large online storage. This paper proposes contextual quantization of token embeddings by decoupling document-specific and document-independent ranking contributions during codebook-based compression. This allows effective online decompression and embedding composition for better search relevance. The paper evaluates this compact token representation model in terms of relevance and space efficiency.
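The idea of storing compact codes and recomposing embeddings online can be illustrated with a toy sketch. This is not the authors' model: it assumes a simple additive decomposition (a document-independent static vector per token plus a document-specific residual snapped to its nearest codebook entry) and a ColBERT-style MaxSim score for the late-interaction step. All names (`encode_residual`, `decode`, `maxsim_score`), the tiny codebook, and the example vectors are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_residual(contextual: Vector, static: Vector, codebook: List[Vector]) -> int:
    """Offline: keep only the index of the codebook entry nearest to the
    document-specific residual (contextual minus static embedding)."""
    residual = [c - s for c, s in zip(contextual, static)]
    return min(range(len(codebook)), key=lambda k: sq_dist(residual, codebook[k]))


def decode(static: Vector, code: int, codebook: List[Vector]) -> Vector:
    """Online: recompose an approximate contextual embedding from the
    document-independent vector and the stored code (additive assumption)."""
    return [s + r for s, r in zip(static, codebook[code])]


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late interaction: each query token takes its best-matching document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Hypothetical toy data: a 3-entry codebook and 2-dimensional embeddings.
codebook: List[Vector] = [[0.0, 0.0], [0.5, -0.5], [-0.5, 0.5]]
static_emb: Dict[str, Vector] = {"bank": [1.0, 0.0], "river": [0.0, 1.0]}
doc_tokens = [("bank", [1.4, -0.4]), ("river", [0.1, 0.9])]

# Offline index: one small integer per token instead of a full vector.
codes = [encode_residual(vec, static_emb[tok], codebook) for tok, vec in doc_tokens]

# Online: decompress, compose, then score against the query embeddings.
decoded = [decode(static_emb[tok], code, codebook)
           for (tok, _), code in zip(doc_tokens, codes)]
score = maxsim_score([[1.0, 0.0]], decoded)
```

In this sketch the per-document index holds only token identities and codebook indices, which is where the storage saving would come from; the learned composition described in the abstract is replaced here by plain vector addition.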
