Segment-level Metric Learning for Few-shot Bioacoustic Event Detection

Paper title

Segment-level Metric Learning for Few-shot Bioacoustic Event Detection

Authors

Haohe Liu, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley

Abstract

Few-shot bioacoustic event detection is the task of detecting the occurrence times of a novel sound given only a few examples. Previous methods employ metric learning to build a latent space from the labeled parts of different sound classes, also known as positive events. In this study, we propose a segment-level few-shot learning framework that utilizes both positive and negative events during model optimization. Training with negative events, which are larger in volume than positive events, can increase the generalization ability of the model. In addition, we use transductive inference on the validation set during training for better adaptation to novel classes. We conduct ablation studies on our proposed method with different setups for input features, training data, and hyper-parameters. Our final system achieves an F-measure of 62.73 on the DCASE 2022 Challenge Task 5 (DCASE2022-T5) validation set, outperforming the baseline prototypical network (F-measure 34.02) by a large margin. Using the proposed method, our submitted system ranks 2nd in DCASE2022-T5. The code for this paper is fully open-sourced at https://github.com/haoheliu/DCASE_2022_Task_5.
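To make the idea of using both positive and negative segments concrete, the following is a minimal sketch of prototype-based classification in the style of a prototypical network. It is not the authors' implementation: the abstract does not specify the distance function or how prototypes are formed, so the squared Euclidean distance, the mean-of-embeddings prototypes, the function names (`prototype`, `sq_dist`, `positive_probability`), and the hand-written 2-D "embeddings" are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def prototype(embeddings: List[Vector]) -> List[float]:
    """Mean of a non-empty list of equal-length embedding vectors."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed metric, not taken from the paper)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def positive_probability(query: Vector, pos_proto: Vector, neg_proto: Vector) -> float:
    """Softmax over negative squared distances to the two prototypes.

    Algebraically equal to exp(-d_pos) / (exp(-d_pos) + exp(-d_neg)),
    evaluated as a numerically stable sigmoid of (d_neg - d_pos).
    """
    z = sq_dist(query, neg_proto) - sq_dist(query, pos_proto)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical segment embeddings; a real system would obtain these from a
# trained encoder applied to fixed-length audio segments.
positive_segments = [[1.0, 0.0], [0.8, 0.2]]              # labeled target events
negative_segments = [[0.0, 1.0], [0.2, 0.8], [0.1, 0.9]]  # everything else

pos_proto = prototype(positive_segments)
neg_proto = prototype(negative_segments)

# Score two unlabeled query segments.
p_event = positive_probability([0.85, 0.1], pos_proto, neg_proto)
p_background = positive_probability([0.0, 1.0], pos_proto, neg_proto)
```

In this toy setup a query segment is assigned to the event class when its probability exceeds 0.5, which is equivalent to being closer to the positive prototype than to the negative one. Negative segments are typically far more plentiful than positive ones in a recording, which is the property the abstract points to when it describes training with negative events.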
