Paper title
DARE: Data Augmented Relation Extraction with GPT-2
Paper authors
Paper abstract
Real-world Relation Extraction (RE) tasks are challenging, either because of limited training data or because of class imbalance. In this work, we present Data Augmented Relation Extraction (DARE), a simple method that augments training data by properly fine-tuning GPT-2 to generate examples for specific relation types. The generated training data is then combined with the gold dataset to train a BERT-based RE classifier. In a series of experiments, we show the advantages of our method, which leads to improvements of up to 11 F1 points over a strong baseline. DARE also achieves a new state of the art on three widely used biomedical RE datasets, surpassing the previous best results by 4.7 F1 points on average.
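The augmentation step described in the abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: `augment_dataset`, `stub_generator`, and `target_per_class` are hypothetical names, the generator is a placeholder standing in for a GPT-2 model fine-tuned on one relation type, and the rule of topping up each class to a fixed size is an assumption that the abstract does not specify.

```python
import random
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (sentence, relation_label)


def augment_dataset(
    gold: List[Example],
    generate: Callable[[str, int], List[str]],
    target_per_class: int,
    seed: int = 0,
) -> List[Example]:
    """Top up each relation type to `target_per_class` with generated sentences.

    Assumption: balancing to a fixed per-class size is only one possible
    way to mix gold and generated data.
    """
    counts = Counter(label for _, label in gold)
    augmented = list(gold)  # gold examples are always kept
    for label, n_gold in sorted(counts.items()):
        n_missing = max(0, target_per_class - n_gold)
        if n_missing:
            augmented.extend((text, label) for text in generate(label, n_missing))
    random.Random(seed).shuffle(augmented)  # deterministic shuffle
    return augmented


def stub_generator(label: str, n: int) -> List[str]:
    # Hypothetical stand-in for sampling n sentences from a GPT-2 model
    # fine-tuned on gold sentences of relation type `label`.
    return [f"<synthetic {label} sentence {i}>" for i in range(n)]


# Toy, imbalanced gold set (made-up sentences, for illustration only).
gold = [
    ("Drug A inhibits enzyme B.", "inhibits"),
    ("Drug C inhibits enzyme D.", "inhibits"),
    ("Drug E inhibits enzyme F.", "inhibits"),
    ("Gene X activates pathway Y.", "activates"),
]
train = augment_dataset(gold, stub_generator, target_per_class=3)
label_counts = Counter(label for _, label in train)
print(label_counts)
```

In this sketch the minority class receives synthetic examples while the gold data is left untouched; the combined list would then be passed to whatever classifier training routine is used, which the abstract states is BERT-based.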