Paper title
BC-VAD: A Robust Bone Conduction Voice Activity Detection
Authors
Abstract
Voice Activity Detection (VAD) is a fundamental module in many audio applications. Recent state-of-the-art VAD systems are often based on neural networks, but matching the performance of larger models requires a computational budget that usually exceeds the capabilities of a small battery-operated device. In this work, we rely on the input from a bone conduction microphone (BCM) to design an efficient VAD (BC-VAD) that is robust against residual non-stationary noise originating from the environment or from speakers not wearing the BCM. We first show that a larger VAD system (58k parameters) achieves state-of-the-art results on a publicly available benchmark but fails when run on bone conduction signals. We then compare its variant BC-VAD (5k parameters, trained on BC data) with a baseline designed specifically for a BCM and show that the proposed method achieves better performance across various metrics while meeting the real-time processing requirement of a microcontroller.