Paper Title

CURE Dataset: Ladder Networks for Audio Event Classification

Authors

Harishchandra Dubey, Dimitra Emmanouilidou, Ivan J. Tashev

Abstract

Audio event classification is an important task for several applications, such as surveillance and audio, video, and multimedia retrieval. There are approximately 3M people with hearing loss who cannot perceive events happening around them. This paper establishes the CURE dataset, which contains a curated set of specific audio events most relevant to people with hearing loss. We propose a ladder-network-based audio event classifier that uses 5 s sound recordings derived from the Freesound project. We adopt state-of-the-art convolutional neural network (CNN) embeddings as audio features for this task. We also investigate the extreme learning machine (ELM) for event classification. In this study, the proposed classifiers are compared with a support vector machine (SVM) baseline. We propose signal and feature normalization that aims to reduce the mismatch between different recording scenarios. First, a CNN is trained on weakly labeled AudioSet data. Next, the pre-trained model is adopted as a feature extractor for the proposed CURE corpus. We incorporate the ESC-50 dataset as a second evaluation set. Results and discussion validate the superiority of the ladder network over the ELM and SVM classifiers in terms of robustness and classification accuracy. While the ladder network is robust to data mismatches, the simpler SVM and ELM classifiers are sensitive to them, and this is where the proposed normalization techniques can play an important role. Experimental studies with the ESC-50 and CURE corpora elucidate the differences in dataset complexity and in the robustness offered by the proposed approaches.
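To make the ELM part of the pipeline concrete, the following is a minimal sketch of a basic extreme learning machine applied to standardized feature vectors. It is not the authors' implementation: the abstract does not specify the ELM configuration or the exact form of the feature normalization, so the per-feature standardization, the hidden-layer size, the ridge term, and all names (`standardize`, `ELM`, `make_synthetic_embeddings`) are assumptions made for illustration. Random synthetic vectors stand in for the CNN embeddings.

```python
import numpy as np


def standardize(train, test):
    """Scale features to zero mean and unit variance using training statistics only.

    Assumed stand-in for the paper's feature normalization, which the abstract
    does not describe in detail.
    """
    mean = train.mean(axis=0)
    std = train.std(axis=0) + 1e-8  # guard against division by zero
    return (train - mean) / std, (test - mean) / std


class ELM:
    """Basic extreme learning machine (hypothetical helper class).

    The hidden layer is random and never trained; only the output weights
    are fitted, in closed form, by ridge-regularized least squares.
    """

    def __init__(self, n_hidden=256, ridge=1e-2, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        dim = X.shape[1]
        # Random, fixed input weights and biases.
        self.W = self.rng.normal(size=(dim, self.n_hidden)) / np.sqrt(dim)
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]  # one-hot targets
        # Solve (H^T H + ridge * I) beta = H^T T for the output weights.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


def make_synthetic_embeddings(n_per_class=100, n_classes=5, dim=128, seed=0):
    """Generate Gaussian clusters as a placeholder for CNN embeddings."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=3.0, size=(n_classes, dim))
    X = np.concatenate([c + rng.normal(size=(n_per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


# Usage: split, normalize with training statistics, fit, and score.
X, y = make_synthetic_embeddings()
perm = np.random.default_rng(1).permutation(len(y))
train_idx, test_idx = perm[:400], perm[400:]
X_train, X_test = standardize(X[train_idx], X[test_idx])
model = ELM().fit(X_train, y[train_idx])
accuracy = float((model.predict(X_test) == y[test_idx]).mean())
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Computing the normalization statistics on the training split alone keeps the test data out of the fit. Since the abstract reports that the SVM and ELM classifiers are sensitive to data mismatch, the choice of which recordings supply those statistics is likely to matter in practice.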
