Paper Title

Respiratory Distress Detection from Telephone Speech using Acoustic and Prosodic Features

Authors

Rashid, Meemnur; Alman, Kaisar Ahmed; Hasan, Khaled; Hansen, John H. L.; Hasan, Taufiq

Abstract

With the widespread use of telemedicine services, automatic assessment of health conditions via telephone speech can significantly impact public health. This work summarizes our preliminary findings on automatic detection of respiratory distress using well-known acoustic and prosodic features. Speech samples are collected from de-identified telemedicine phone calls from a healthcare provider in Bangladesh. The recordings include conversational speech samples of patients who show mild or severe respiratory distress or asthma symptoms while talking to doctors. We hypothesize that respiratory distress may alter speech features such as voice quality, speaking pattern, loudness, and speech-pause duration. To capture these variations, we use a set of well-known acoustic and prosodic features with a Support Vector Machine (SVM) classifier to detect the presence of respiratory distress. Experimental evaluations are performed using a 3-fold cross-validation scheme with patient-independent data splits. We obtained an overall accuracy of 86.4% in detecting respiratory distress from the speech recordings using the acoustic feature set. Correlation analysis reveals that the top-performing features include loudness, voice rate, voice duration, and pause duration.
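The abstract does not say how the patient-independent folds were built, so the following is only a minimal sketch of one way to do it, not the authors' procedure. It assumes each recording carries a patient identifier; the function name `patient_independent_folds` and the toy identifiers are hypothetical. Every recording from a given patient is placed in the same fold, so no speaker appears in both the training and the test set of any split.

```python
from collections import defaultdict


def patient_independent_folds(patient_ids, n_folds=3):
    """Assign each recording to a fold so that no patient spans two folds.

    patient_ids: one patient identifier per recording (hypothetical labels).
    Returns a list of (train_indices, test_indices) pairs, one per fold.
    """
    # Group recording indices by patient.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    if len(by_patient) < n_folds:
        raise ValueError("need at least as many patients as folds")

    # Greedy balancing: place the patients with the most recordings first,
    # each into the fold that currently holds the fewest recordings.
    folds = [[] for _ in range(n_folds)]
    for pid in sorted(by_patient, key=lambda p: (-len(by_patient[p]), str(p))):
        smallest = min(range(n_folds), key=lambda k: len(folds[k]))
        folds[smallest].extend(by_patient[pid])

    # Each fold serves once as the test set; the rest form the training set.
    splits = []
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j in range(n_folds) if j != k for i in folds[j])
        splits.append((train, test))
    return splits


# Toy example: 7 recordings from 4 hypothetical patients.
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4"]
for train, test in patient_independent_folds(patient_ids, n_folds=3):
    train_patients = {patient_ids[i] for i in train}
    test_patients = {patient_ids[i] for i in test}
    # No patient contributes to both sides of a split.
    assert train_patients.isdisjoint(test_patients)
```

In each split, a classifier such as an SVM would be fitted on the feature vectors at the `train` indices and scored on those at the `test` indices. Grouping by patient matters because several recordings of one speaker are highly correlated, and a per-recording split would let the classifier benefit from having already heard the test speaker.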
