Paper title
Clustering-based Inference for Biomedical Entity Linking
Paper authors
Paper abstract
Due to the large number of entities in biomedical knowledge bases, only a small fraction of entities have corresponding labelled training data. This necessitates entity linking models that can link mentions of unseen entities using learned representations of entities. Previous approaches link each mention independently, ignoring the relationships between entity mentions within and across documents. These relationships can be very useful for linking mentions in biomedical text, where linking decisions are often difficult because mentions have a generic or a highly specialized form. In this paper, we introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions. In experiments on the largest publicly available biomedical dataset, we improve the best independent prediction for entity linking by 3.0 points of accuracy, and our clustering-based inference model further improves entity linking by 2.3 points.
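To illustrate the difference between independent and joint linking, here is a minimal sketch in Python. It is not the authors' algorithm: the greedy merge order, the rule of at most one entity per cluster, the function name `cluster_link`, and all affinity scores are assumptions chosen for illustration. Mentions and entities are treated as graph nodes, edges are merged from highest to lowest affinity, and every mention inherits the entity of its cluster.

```python
from typing import Dict, List, Optional, Tuple


def cluster_link(
    mentions: List[str],
    entities: List[str],
    edges: List[Tuple[float, str, str]],
) -> Dict[str, Optional[str]]:
    """Greedy clustering sketch (hypothetical, not the paper's exact method).

    edges holds (affinity, node_a, node_b) for mention-mention and
    mention-entity pairs. Clusters are merged in descending affinity
    order, and a merge is skipped if it would put two entities together.
    """
    parent = {node: node for node in mentions + entities}
    # Maps a cluster root to the single entity that cluster contains.
    entity_of: Dict[str, str] = {e: e for e in entities}

    def find(node: str) -> str:
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for _score, a, b in sorted(edges, reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue
        ent_a, ent_b = entity_of.get(root_a), entity_of.get(root_b)
        if ent_a is not None and ent_b is not None:
            continue  # would merge two different entities into one cluster
        parent[root_b] = root_a
        if ent_b is not None:
            entity_of[root_a] = ent_b

    # A mention in a cluster without an entity stays unlinked (None).
    return {m: entity_of.get(find(m)) for m in mentions}


# Illustrative scores: m1 is a generic mention whose best direct match is E2,
# but it is strongly related to m2, which clearly refers to E1.
links = cluster_link(
    mentions=["m1", "m2", "m3"],
    entities=["E1", "E2"],
    edges=[
        (0.9, "m2", "E1"),
        (0.8, "m1", "m2"),
        (0.6, "m1", "E2"),
        (0.3, "m1", "E1"),
    ],
)
print(links)
```

With these made-up scores, linking `m1` on its own would pick `E2`, while the clustered prediction assigns it `E1` through its stronger tie to `m2`. The mention `m3` has no edges and therefore stays unlinked.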