Paper Title

LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned Source Separation

Authors

Woosung Choi, Minseok Kim, Jaehwa Chung, Soonyoung Jung

Abstract

Recent deep-learning approaches have shown that Frequency Transformation (FT) blocks can significantly improve spectrogram-based single-source separation models by capturing frequency patterns. The goal of this paper is to extend the FT block to fit the multi-source task. We propose the Latent Source Attentive Frequency Transformation (LaSAFT) block to capture source-dependent frequency patterns. We also propose the Gated Point-wise Convolutional Modulation (GPoCM), an extension of Feature-wise Linear Modulation (FiLM), to modulate internal features. By employing these two novel methods, we extend the Conditioned-U-Net (CUNet) for multi-source separation, and the experimental results indicate that our LaSAFT and GPoCM can improve the CUNet's performance, achieving state-of-the-art SDR performance on several MUSDB18 source separation tasks.
