Paper Title
OSLAT: Open Set Label Attention Transformer for Medical Entity Retrieval and Span Extraction
Paper Authors
Paper Abstract
Medical entity span extraction and linking are critical steps for many healthcare NLP tasks. Most existing entity extraction methods either have a fixed vocabulary of medical entities or require span annotations. In this paper, we propose a method for linking an open set of entities that does not require any span annotations. Our method, Open Set Label Attention Transformer (OSLAT), uses the label-attention mechanism to learn candidate-entity contextualized text representations. We find that OSLAT can not only link entities but is also able to implicitly learn spans associated with entities. We evaluate OSLAT on two tasks: (1) span extraction trained without explicit span annotations, and (2) entity linking trained without span-level annotation. We test the generalizability of our method by training two separate models on two datasets with low entity overlap and comparing cross-dataset performance.
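The core idea in the abstract is a label-attention mechanism: the embedding of a candidate entity name attends over the token embeddings of the input text, producing an entity-contextualized representation, and the attention weights themselves indicate the implicit span. A minimal sketch of this pooling step (not the authors' actual implementation; the embeddings, scoring function, and `label_attention` helper here are illustrative assumptions):

```python
import numpy as np

def label_attention(token_embs, label_emb):
    """Attend from a candidate-entity embedding over token embeddings.

    token_embs: (T, d) contextual token vectors for the input text
    label_emb:  (d,)   embedding of the candidate entity name
    Returns the label-contextualized text vector and attention weights;
    high-weight tokens serve as the implicitly learned span.
    """
    scores = token_embs @ label_emb                   # (T,) dot-product scores
    scores = scores - scores.max()                    # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over tokens
    pooled = weights @ token_embs                     # (d,) attention-pooled representation
    return pooled, weights

# Toy example: 4 tokens with 3-dim embeddings; the entity embedding
# is aligned with token 2, so attention should concentrate there.
tokens = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.5, 0.5, 0.0]])
entity = np.array([0.0, 0.0, 1.0])
rep, w = label_attention(tokens, entity)
span_token = int(np.argmax(w))  # index of the highest-attention token
```

Because the entity embedding, rather than a fixed classification head, drives the attention, the same mechanism applies to entities never seen in training, which is what makes the entity set open.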