Learning with Silver Standard Data for Zero-shot Relation Extraction

Paper Title

Learning with Silver Standard Data for Zero-shot Relation Extraction

Authors

Wang, Tianyin; Wang, Jianwei; Zeng, Ziqian

Abstract

The strong performance of supervised relation extraction (RE) methods relies heavily on a large amount of gold standard data. Recent zero-shot relation extraction methods convert the RE task into other NLP tasks and use off-the-shelf models for those tasks to perform inference directly on the test data, without requiring a large amount of annotated RE data. A potentially valuable by-product of these methods is large-scale silver standard data. However, the use of this silver standard data has not been investigated further. In this paper, we propose to first detect a small amount of clean data within the silver standard data and then use the selected clean data to finetune the pretrained model. We then use the finetuned model to infer relation types. We also propose a class-aware clean data detection module that takes class information into account when selecting clean data. The experimental results show that our method can outperform the baseline by 12% and 11% on the TACRED and Wiki80 datasets, respectively, in the zero-shot RE task. By using extra silver standard data from different distributions, the performance can be further improved.
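
The abstract does not state which criterion the class-aware clean data detection module uses, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each silver standard example carries a confidence score from the off-the-shelf model, and it keeps the top fraction of examples within each silver label rather than globally, so that low-confidence classes are not crowded out. The function name `select_clean_per_class`, the tuple layout, and the `keep_ratio` parameter are hypothetical and do not come from the paper.

```python
from collections import defaultdict
import math


def select_clean_per_class(examples, keep_ratio=0.2):
    """Pick a likely-clean subset of silver data, class by class.

    examples: list of (text, silver_label, confidence) tuples.
    keep_ratio: assumed fraction of each class to keep (hypothetical knob).
    """
    # Group the silver examples by their predicted relation label.
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)

    selected = []
    for label, group in by_label.items():
        # Rank within the class, highest confidence first.
        group.sort(key=lambda example: example[2], reverse=True)
        # Keep at least one example per class so no relation disappears.
        keep_count = max(1, math.ceil(keep_ratio * len(group)))
        selected.extend(group[:keep_count])
    return selected
```

The selected subset would then serve as training data for finetuning the pretrained model. Ranking within each class instead of over the whole pool is the design choice this sketch is meant to illustrate: a single global threshold would tend to keep only the classes the off-the-shelf model is most confident about.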
