Paper Title
On the Design and Training Strategies for RNN-based Online Neural Speech Separation Systems
Paper Authors
Paper Abstract
While the performance of offline neural speech separation systems has been greatly advanced by the recent development of novel neural network architectures, there is typically an inevitable performance gap between the systems and their online variants. In this paper, we investigate how RNN-based offline neural speech separation systems can be changed into their online counterparts while mitigating the performance degradation. We decompose or reorganize the forward and backward RNN layers in a bidirectional RNN layer to form an online path and an offline path, which enables the model to perform both online and offline processing with the same set of model parameters. We further introduce two training strategies for improving the online model via either a pretrained offline model or a multitask training objective. Experimental results show that compared to the online models that are trained from scratch, the proposed layer decomposition and reorganization schemes and training strategies can effectively mitigate the performance gap between two RNN-based offline separation models and their online variants.
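To make the decomposition idea concrete, here is a minimal NumPy sketch of the general principle the abstract describes: a bidirectional RNN layer holds a forward RNN and a backward RNN, and the same parameters can serve two inference paths — a causal online path that runs only the forward RNN, and an offline path that runs both directions and concatenates their outputs. This is an illustrative toy (plain Elman cells, hypothetical class and function names), not the paper's actual model or its specific decomposition/reorganization schemes.

```python
import numpy as np

def rnn_pass(x, W, U, b, reverse=False):
    """Scan a simple Elman RNN over the time axis of x with shape (T, D)."""
    T, _ = x.shape
    H = b.shape[0]
    h = np.zeros(H)
    out = np.zeros((T, H))
    steps = range(T - 1, -1, -1) if reverse else range(T)
    for t in steps:
        h = np.tanh(x[t] @ W + h @ U + b)
        out[t] = h
    return out

class DecomposedBiRNN:
    """One shared parameter set, two inference paths:
    - online():  causal, uses the forward RNN only
    - offline(): forward + backward outputs concatenated (standard BiRNN)
    """
    def __init__(self, D, H, seed=0):
        rng = np.random.default_rng(seed)
        # forward-direction parameters (used by both paths)
        self.Wf = rng.standard_normal((D, H)) * 0.1
        self.Uf = rng.standard_normal((H, H)) * 0.1
        self.bf = np.zeros(H)
        # backward-direction parameters (offline path only)
        self.Wb = rng.standard_normal((D, H)) * 0.1
        self.Ub = rng.standard_normal((H, H)) * 0.1
        self.bb = np.zeros(H)

    def offline(self, x):
        fwd = rnn_pass(x, self.Wf, self.Uf, self.bf)
        bwd = rnn_pass(x, self.Wb, self.Ub, self.bb, reverse=True)
        return np.concatenate([fwd, bwd], axis=-1)

    def online(self, x):
        # Causal path: no access to future frames, same forward weights.
        return rnn_pass(x, self.Wf, self.Uf, self.bf)
```

Because the forward weights are shared, the online output coincides with the forward half of the offline output, and perturbing future frames cannot change earlier online outputs — the causality property the online variant needs.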