Paper title
TERMinator: A system for scientific texts processing
Authors
Abstract
This paper addresses the extraction of entities and the semantic relations between them from scientific texts, where scientific terms are treated as entities. We present a dataset annotated for both tasks and develop a system called TERMinator to study the influence of language models on term recognition and to compare different approaches to relation extraction. Experiments show that language models pre-trained on the target language do not always achieve the best performance. In addition, adding some heuristic approaches may improve the overall quality on a particular task. The developed tool and the annotated corpus are publicly available at https://github.com/iis-research-team/terminator and may be useful to other researchers.