Paper Title
Polish Natural Language Inference and Factivity -- an Expert-based Dataset and Benchmarks
Authors
Abstract
Despite recent breakthroughs in Machine Learning for Natural Language Processing, Natural Language Inference (NLI) problems still constitute a challenge. To this end we contribute a new dataset that focuses exclusively on the factivity phenomenon; however, our task remains the same as in other NLI tasks, i.e. prediction of entailment, contradiction or neutral (ECN). The dataset contains entirely natural language utterances in Polish and gathers 2,432 verb-complement pairs and 309 unique verbs. The dataset is based on the National Corpus of Polish (NKJP) and is a representative sample with regard to the frequency of main verbs and other linguistic features (e.g. occurrence of internal negation). We found that transformer BERT-based models working on sentences obtained relatively good results ($\approx89\%$ F1 score). Even though better results were achieved using linguistic features ($\approx91\%$ F1 score), this model requires more human labour (humans in the loop) because the features were prepared manually by expert linguists. BERT-based models consuming only the input sentences show that they capture most of the complexity of NLI/factivity. Complex cases of the phenomenon, e.g. cases with entailment (E) and non-factive verbs, remain an open issue for further research.
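The sentence-only benchmark described above amounts to fine-tuning a BERT-style encoder as a three-way ECN classifier over a premise (the full utterance) and a hypothesis (the complement clause). The sketch below illustrates that setup; the checkpoint name (`allegro/herbert-base-cased`, a publicly available Polish BERT), the label order, and the example sentences are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a sentence-only ECN classifier for Polish NLI/factivity.
# Assumptions: HerBERT as the Polish encoder, 3 labels in E/C/N order,
# and a hypothetical verb-complement example pair.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "allegro/herbert-base-cased"          # assumed Polish BERT checkpoint
LABELS = ["entailment", "contradiction", "neutral"]  # ECN classes from the abstract

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS)
)

def predict_ecn(utterance: str, complement: str) -> str:
    """Classify an (utterance, complement) pair as entailment / contradiction / neutral."""
    inputs = tokenizer(
        utterance, complement,
        return_tensors="pt", truncation=True, max_length=256,
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

# Hypothetical example: a factive main verb ("wiedzieć", to know) with its complement.
print(predict_ecn("Jan wie, że Maria wyjechała.", "Maria wyjechała."))
```

Before fine-tuning on the dataset's verb-complement pairs, the classification head is randomly initialised, so predictions from this sketch are not meaningful; it only shows the input format (premise-hypothesis pair) and the three-class output the benchmark evaluates with F1.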