Paper Title

AudVowelConsNet: A Phoneme-Level Based Deep CNN Architecture for Clinical Depression Diagnosis

Paper Authors

Muzammel, Muhammad; Salam, Hanan; Hoffmann, Yann; Chetouani, Mohamed; Othmani, Alice

Paper Abstract

Depression is a common and serious mood disorder that negatively affects a patient's capacity to function normally in daily tasks. Speech has proven to be a powerful tool in depression diagnosis. Research in psychiatry has concentrated on fine-grained analysis of the word-level speech components that contribute to the manifestation of depression in speech, and has revealed significant variations at the phoneme level in depressed speech. Research on Machine Learning-based automatic recognition of depression from speech, on the other hand, has focused on exploring various acoustic features for detecting depression and its severity level. Few studies have incorporated phoneme-level speech components into automatic assessment systems. In this paper, we propose an Artificial Intelligence (AI) based application for clinical depression recognition and assessment from speech. We investigate the acoustic characteristics of phoneme units, specifically vowels and consonants, for depression recognition via Deep Learning. We present and compare three spectrogram-based Deep Neural Network architectures, trained on phoneme consonant units, vowel units, and their fusion, respectively. Our experiments show that deep-learned consonant-based acoustic characteristics lead to better recognition results than vowel-based ones. The fusion of vowel and consonant speech characteristics through a deep network significantly outperforms the single-space networks as well as state-of-the-art deep learning approaches on the DAIC-WOZ database.
