Atomized Search Length: Beyond User Models

Title

Atomized Search Length: Beyond User Models

Authors

John Alex, Keith Hall, Donald Metzler

Abstract

We argue that current IR metrics, modeled on optimizing user experience, measure too narrow a portion of the IR space. If IR systems are weak, these metrics undersample or completely filter out the deeper documents that need improvement. If IR systems are relatively strong, these metrics undersample deeper relevant documents that could underpin even stronger IR systems, ones that could present content from tens or hundreds of relevant documents in a user-digestible hierarchy or text summary. We reanalyze over 70 TREC tracks from the past 28 years, showing that roughly half undersample top ranked documents and nearly all undersample tail documents. We show that in the 2020 Deep Learning tracks, neural systems were actually near-optimal at top-ranked documents, compared to only modest gains over BM25 on tail documents. Our analysis is based on a simple new systems-oriented metric, 'atomized search length', which is capable of accurately and evenly measuring all relevant documents at any depth.
