State-of-the-art generalisation research in NLP: A taxonomy and review

Paper title

State-of-the-art generalisation research in NLP: A taxonomy and review

Authors

Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, Tiago Pimentel, Christos Christodoulopoulos, Karim Lasri, Naomi Saphra, Arabella Sinclair, Dennis Ulmer, Florian Schottmann, Khuyagbaatar Batsuren, Kaiser Sun, Koustuv Sinha, Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin

Abstract

The ability to generalise well is one of the primary desiderata of natural language processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is not well understood, nor are there any evaluation standards for generalisation. In this paper, we lay the groundwork to address both of these issues. We present a taxonomy for characterising and understanding generalisation research in NLP. Our taxonomy is based on an extensive literature review of generalisation research, and contains five axes along which studies can differ: their main motivation, the type of generalisation they investigate, the type of data shift they consider, the source of this data shift, and the locus of the shift within the modelling pipeline. We use our taxonomy to classify over 400 papers that test generalisation, for a total of more than 600 individual experiments. Considering the results of this review, we present an in-depth analysis that maps out the current state of generalisation research in NLP, and we make recommendations for which areas might deserve attention in the future. Along with this paper, we release a webpage where the results of our review can be dynamically explored, and which we intend to update as new NLP generalisation studies are published. With this work, we aim to take steps towards making state-of-the-art generalisation testing the new status quo in NLP.
