Paper Title

Joint Coreference Resolution for Zeros and non-Zeros in Arabic

Authors

Abdulrahman Aloraini, Sameer Pradhan, Massimo Poesio

Abstract

Most existing proposals about anaphoric zero pronoun (AZP) resolution regard full-mention coreference and AZP resolution as two independent tasks, even though the two tasks are clearly related. The main issues that need tackling to develop a joint model for zero and non-zero mentions are the difference between the two types of arguments (zero pronouns, being null, provide no nominal information) and the lack of annotated datasets of a suitable size in which both types of arguments are annotated for languages other than Chinese and Japanese. In this paper, we introduce two architectures for jointly resolving AZPs and non-AZPs, and evaluate them on Arabic, a language for which, as far as we know, there has been no prior work on joint resolution. Doing this also required creating a new version of the Arabic subset of the standard coreference resolution dataset used for the CoNLL-2012 shared task (Pradhan et al., 2012) in which both zeros and non-zeros are included in a single dataset.
