Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems

Paper Title

Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems

Authors

Xie, Yi; Shi, Cong; Li, Zhuohang; Liu, Jian; Chen, Yingying; Yuan, Bo

Abstract

As voice user interfaces (VUIs) have surged in popularity in recent years, speaker recognition systems have emerged as an important means of identifying speakers in many security-critical applications and services. In this paper, we propose the first real-time, universal, and robust adversarial attack against state-of-the-art deep neural network (DNN) based speaker recognition systems. By adding an audio-agnostic universal perturbation to an arbitrary enrolled speaker's voice input, the attack causes the DNN-based speaker recognition system to identify the speaker as any target (i.e., adversary-desired) speaker label. In addition, we improve the robustness of the attack by estimating the room impulse response (RIR) to model the sound distortions caused by physical over-the-air propagation. Experiments on a public dataset of 109 English speakers demonstrate the effectiveness and robustness of the proposed attack, with an attack success rate of over 90%. The attack launch time is also 100x faster than that of contemporary non-universal attacks.
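The two signal-level ideas in the abstract, adding one fixed perturbation to any input and modeling over-the-air playback with an RIR, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the tiling of the perturbation to match the input length, the amplitude bound `epsilon`, and the synthetic decaying-noise RIR are all assumptions made for illustration, and the step that optimizes the perturbation against a speaker recognition model is omitted entirely.

```python
import numpy as np


def apply_universal_perturbation(audio, delta, epsilon=0.05):
    """Add one fixed perturbation to an arbitrary-length waveform.

    Assumption: the perturbation is tiled to cover the input and clipped
    to [-epsilon, epsilon]; the paper's exact scheme may differ.
    """
    delta = np.clip(delta, -epsilon, epsilon)
    reps = int(np.ceil(len(audio) / len(delta)))
    tiled = np.tile(delta, reps)[: len(audio)]
    # Keep the result inside the valid range for normalized audio.
    return np.clip(audio + tiled, -1.0, 1.0)


def simulate_over_the_air(signal, rir):
    """Approximate playback in a room by convolving with an RIR.

    The output is truncated to the input length for convenience.
    """
    return np.convolve(signal, rir, mode="full")[: len(signal)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sr = 16000  # hypothetical sample rate
    audio = 0.1 * rng.standard_normal(2 * sr)   # stand-in for a speech clip
    delta = 0.01 * rng.standard_normal(sr)      # stand-in for a learned perturbation

    # Synthetic RIR (assumption): direct path plus exponentially decaying noise.
    t = np.arange(int(0.2 * sr))
    rir = 0.05 * rng.standard_normal(len(t)) * np.exp(-t / (0.05 * sr))
    rir[0] = 1.0

    adversarial = apply_universal_perturbation(audio, delta)
    received = simulate_over_the_air(adversarial, rir)
    print(adversarial.shape, received.shape)
```

Because the perturbation does not depend on the input, it can be prepared offline and simply added at attack time, which is consistent with the launch-time advantage over non-universal attacks that the abstract reports.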
