Audio Tampering Detection Based on Shallow and Deep Feature Representation Learning

Paper Title

Audio Tampering Detection Based on Shallow and Deep Feature Representation Learning

Authors

Wang, Zhifeng; Yang, Yao; Zeng, Chunyan; Kong, Shuai; Feng, Shixiong; Zhao, Nan

Abstract

Digital audio tampering detection can be used to verify the authenticity of digital audio. However, most current methods either compare the electric network frequency (ENF) continuity of the audio visually against a standard ENF database, or extract features for classification by machine learning. ENF databases are usually difficult to obtain, visual methods have weak feature representation, and machine learning methods lose more information during feature extraction, all of which lower detection accuracy. This paper proposes a method that fuses shallow and deep features to make full use of ENF information: by exploiting the complementary nature of features at different levels, it describes more accurately the inconsistencies that tampering operations introduce into the original digital audio. The method achieves 97.03% accuracy on three classic databases: Carioca 1, Carioca 2, and New Spanish. In addition, it achieves an accuracy of 88.31% on the newly constructed database GAUDI-DI. Experimental results show that the proposed method outperforms state-of-the-art methods.
