Paper Title

Prompt-based Generative Approach towards Multi-Hierarchical Medical Dialogue State Tracking

Authors

Liu, Jun; Ruan, Tong; Wang, Haofen; Zhang, Huanhuan

Abstract

Medical dialogue systems are a promising application that can offer patients great convenience. The dialogue state tracking (DST) module of such a system, which interprets utterances into a machine-readable structure for downstream tasks, is particularly challenging. First, the states must be able to represent compound entities, such as symptoms with their body parts or diseases with degrees of severity, to provide enough information for decision support. Second, the named entities in an utterance may be discontinuous and scattered across sentences and speakers. These properties also make it difficult to annotate a large corpus, which most methods require. Therefore, we first define a multi-hierarchical state structure, and we annotate and publish a medical dialogue dataset in Chinese. To the best of our knowledge, no such dataset was publicly available before. We then propose a Prompt-based Generative Approach that generates slot values with multiple hierarchies incrementally in a top-down manner. A dialogue-style prompt is also added to exploit a large unlabeled dialogue corpus and alleviate the data scarcity problem. Experiments show that our approach outperforms other DST methods and is effective when little data is available.
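
To make the idea of a multi-hierarchical state and top-down, incremental slot generation more concrete, here is a minimal Python sketch. It is not the authors' implementation: the slot names in `HIERARCHY`, the prompt strings, the `SlotNode` class, and the `toy_generate` stand-in (a lookup table used in place of a real generative model) are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical slot hierarchy: each top-level slot lists the child slots
# that qualify it. These names are illustrative, not the paper's schema.
HIERARCHY: Dict[str, List[str]] = {
    "symptom": ["body_part"],
    "disease": ["severity"],
}


@dataclass
class SlotNode:
    """One slot value plus the lower-level slot values attached to it."""
    slot: str
    value: str
    children: List["SlotNode"] = field(default_factory=list)


# A generator takes (dialogue, prompt) and returns zero or more values.
Generator = Callable[[str, str], List[str]]


def track_state(dialogue: str, generate: Generator) -> List[SlotNode]:
    """Fill the state top-down: parent values first, then their children."""
    state: List[SlotNode] = []
    for top_slot, child_slots in HIERARCHY.items():
        for value in generate(dialogue, f"{top_slot}:"):
            node = SlotNode(top_slot, value)
            for child in child_slots:
                # The child prompt is conditioned on the parent value.
                prompt = f"{top_slot}: {value}; {child}:"
                for child_value in generate(dialogue, prompt):
                    node.children.append(SlotNode(child, child_value))
            state.append(node)
    return state


def toy_generate(dialogue: str, prompt: str) -> List[str]:
    """Stand-in for a generative model: answers come from a fixed table."""
    table = {
        "symptom:": ["pain"],
        "symptom: pain; body_part:": ["stomach"],
    }
    return table.get(prompt, [])


dialogue = "Patient: It is my stomach. Doctor: Does it hurt? Patient: Yes."
state = track_state(dialogue, toy_generate)
```

In this toy run, `state` holds a single `symptom` node with value `pain` and one `body_part` child with value `stomach`, even though the two pieces appear in different turns. In a real system, `generate` would wrap a trained sequence-to-sequence or language model rather than a lookup table.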
