Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews

Paper title

Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews

Authors

Rachid Riad, Hadrien Titeux, Laurie Lemoine, Justine Montillot, Agnes Sliwinski, Jennifer Hamet Bagnou, Xuan Nga Cao, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux

Abstract

Conversations between a clinician and a patient in natural conditions are valuable sources of information for medical follow-up. Automatic analysis of these dialogues could help extract new language markers and speed up clinicians' reports. Yet it is not clear which speech processing pipeline performs best at detecting and identifying speaker turns, especially for individuals with speech and language disorders. Here, we proposed a split of the data that allows a comparative evaluation of speaker role recognition and speaker enrollment methods for this task. We trained end-to-end neural network architectures adapted to each task and evaluated each approach under the same metric. Experimental results are reported on naturalistic clinical conversations between Neuropsychologist and Interviewees at different stages of Huntington's disease. We found that our Speaker Role Recognition model gave the best performance. In addition, our study underlined the importance of retraining models with in-domain data. Finally, we observed that results do not depend on the demographics of the Interviewee, highlighting the clinical relevance of our methods.
