Recycling an anechoic pre-trained speech separation deep neural network for binaural dereverberation of a single source

Paper title

Recycling an anechoic pre-trained speech separation deep neural network for binaural dereverberation of a single source

Authors

Gul, Sania; Khan, Muhammad Salman; Shah, Syed Waqar; Ur-Rehman, Ata

Abstract

Reverberation reduces speech intelligibility for both normal-hearing and hearing-impaired listeners. This paper presents a novel psychoacoustic approach to dereverberating a single speech source by recycling a pre-trained binaural anechoic speech separation neural network. Because training a deep neural network (DNN) is a lengthy and computationally expensive process, using a pre-trained separation network for dereverberation has the advantage that the network does not need to be retrained, which saves both time and computational resources. The interaural cues of a reverberant source are given to this pre-trained neural network to discriminate between the direct-path signal and the reverberant speech. The results show an average improvement of 1.3% in signal intelligibility, 0.83 dB in SRMR (signal to reverberation energy ratio), and 0.16 points in perceptual evaluation of speech quality (PESQ) over other state-of-the-art signal processing dereverberation algorithms, and of 14% in intelligibility and 0.35 points in quality over orthogonal matching pursuit with spectral subtraction (OSS), a machine learning based dereverberation algorithm.
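The abstract does not say which interaural cues are used or how they are computed. As a rough illustration only, the sketch below computes the two cues most commonly meant by that term, the interaural level difference (ILD) and the interaural phase difference (IPD), for each time-frequency bin of a binaural signal. The function names, the frame length, the hop size, and the toy input are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def stft(x, frame_len=512, hop=256):
    """Naive STFT with a Hann window; returns a (frames, bins) complex array."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def interaural_cues(left, right, eps=1e-12):
    """Return per-bin ILD (in dB) and IPD (in radians) for a binaural pair.

    Hypothetical helper for illustration; the paper's exact features may differ.
    """
    spec_left = stft(left)
    spec_right = stft(right)
    # Level difference: ratio of the magnitudes, expressed in decibels.
    ild_db = 20.0 * np.log10((np.abs(spec_left) + eps) / (np.abs(spec_right) + eps))
    # Phase difference: angle of the cross-spectrum, wrapped to (-pi, pi].
    ipd_rad = np.angle(spec_left * np.conj(spec_right))
    return ild_db, ipd_rad


# Toy input (assumption): the right ear receives the same signal at half amplitude.
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
ild_db, ipd_rad = interaural_cues(source, 0.5 * source)
```

In this toy case the right channel is a scaled copy of the left, so the ILD is about 6 dB in every bin and the IPD is zero. For a reverberant recording the cues vary from bin to bin, and that variation is what a network trained on interaural cues can use to tell the direct path from the reverberant energy.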
