Paper Title

BC-VAD: A Robust Bone Conduction Voice Activity Detection

Authors

Polvani, Niccolo'; Ronssin, Damien; Cernak, Milos

Abstract

Voice Activity Detection (VAD) is a fundamental module in many audio applications. Recent state-of-the-art VAD systems are often based on neural networks, but matching the performance of larger models typically requires a computational budget that exceeds the capabilities of a small battery-operated device. In this work, we rely on the input from a bone conduction microphone (BCM) to design an efficient VAD (BC-VAD) that is robust against residual non-stationary noises originating from the environment or from speakers not wearing the BCM. We first show that a larger VAD system (58k parameters) achieves state-of-the-art results on a publicly available benchmark but fails when running on bone conduction signals. We then compare its variant BC-VAD (5k parameters, trained on BC data) with a baseline specifically designed for a BCM and show that the proposed method achieves better performance under various metrics while meeting the real-time processing requirement of a microcontroller.
