Paper Title
An Empirical Study of Derivative-Free-Optimization Algorithms for Targeted Black-Box Attacks in Deep Neural Networks
Paper Authors
Paper Abstract
We perform a comprehensive study of the performance of derivative-free optimization (DFO) algorithms for generating targeted black-box adversarial attacks on Deep Neural Network (DNN) classifiers, assuming the perturbation energy is bounded by an $\ell^\infty$ constraint and the number of queries to the network is limited. This paper considers four pre-existing state-of-the-art DFO-based algorithms, along with a new algorithm built on BOBYQA, a model-based DFO method. We compare these algorithms in a variety of settings according to the fraction of images that they successfully misclassify given a maximum number of queries to the DNN. The experiments reveal how the likelihood of finding an adversarial example depends on both the algorithm used and the setting of the attack: algorithms that restrict the search for adversarial examples to the vertices of the $\ell^\infty$ ball work particularly well against networks without structural defenses, while the proposed BOBYQA-based algorithm performs better at especially small perturbation energies. This variation in performance highlights the importance of comparing new algorithms to the state-of-the-art in a variety of settings, and of testing the effectiveness of adversarial defenses using as wide a range of algorithms as possible.
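
The setting described in the abstract can be phrased as a bound-constrained, gradient-free minimization: find a perturbation $\delta$ with $\|\delta\|_\infty \le \varepsilon$ that drives the black-box classifier toward the target class, with each loss evaluation costing one query. Below is a minimal sketch of that formulation using the Py-BOBYQA solver. It is not the paper's implementation; the toy query_model oracle, its random weights W, the problem sizes, and the particular loss are hypothetical stand-ins.

    # A minimal sketch: a targeted black-box attack posed as bound-constrained
    # derivative-free optimization and handed to the Py-BOBYQA solver.
    # All model details here are illustrative stand-ins, not the paper's setup.
    import numpy as np
    import pybobyqa  # pip install Py-BOBYQA

    rng = np.random.default_rng(0)
    d, n_classes, target = 32, 10, 3          # toy sizes; hypothetical values
    W = rng.standard_normal((n_classes, d))   # stand-in for the black-box DNN

    def query_model(x):
        """Black-box oracle: returns class probabilities only, no gradients.
        In a real attack each call counts against the query budget."""
        logits = W @ x
        e = np.exp(logits - logits.max())
        return e / e.sum()

    x_orig = rng.uniform(0.2, 0.8, d)         # the "image" being attacked
    eps = 0.1                                 # l_inf perturbation energy

    def attack_loss(delta):
        """Targeted loss: small when the target class gets high probability."""
        p = query_model(np.clip(x_orig + delta, 0.0, 1.0))
        return -np.log(p[target] + 1e-12)

    # Minimize the loss over the l_inf ball [-eps, eps]^d around the image.
    soln = pybobyqa.solve(attack_loss, np.zeros(d),
                          bounds=(-eps * np.ones(d), eps * np.ones(d)),
                          rhobeg=eps / 2, maxfun=500)
    x_adv = np.clip(x_orig + soln.x, 0.0, 1.0)
    print("target-class probability:", query_model(x_adv)[target])

In this framing, the query budget of the abstract corresponds to the maxfun cap on loss evaluations, and the $\ell^\infty$ energy constraint becomes the box bounds passed to the solver.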