GeoDA: A Geometric Framework for Black-Box Adversarial Attacks

Paper Title

GeoDA: a geometric framework for black-box adversarial attacks

Authors

Ali Rahmati, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Huaiyu Dai

Abstract

Adversarial examples are known as carefully perturbed images fooling image classifiers. We propose a geometric framework to generate adversarial examples in one of the most challenging black-box settings where the adversary can only generate a small number of queries, each of them returning the top-$1$ label of the classifier. Our framework is based on the observation that the decision boundary of deep networks usually has a small mean curvature in the vicinity of data samples. We propose an effective iterative algorithm to generate query-efficient black-box perturbations with small $\ell_p$ norms for $p \ge 1$, which is confirmed via experimental evaluations on state-of-the-art natural image classifiers. Moreover, for $p=2$, we theoretically show that our algorithm actually converges to the minimal $\ell_2$-perturbation when the curvature of the decision boundary is bounded. We also obtain the optimal distribution of the queries over the iterations of the algorithm. Finally, experimental results confirm that our principled black-box attack algorithm performs better than state-of-the-art algorithms as it generates smaller perturbations with a reduced number of queries.
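The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of the geometric idea it relies on: if the decision boundary is nearly flat around a data sample, its normal direction can be estimated from top-1 label queries alone. Everything here is an assumption made for illustration, not the authors' implementation: the toy linear classifier `top1_label`, the helpers `boundary_point` and `estimate_normal`, and the values of `sigma` and `n_queries` are all hypothetical.

```python
import math
import random


def top1_label(x, w, b):
    """Hypothetical black-box oracle: returns only the top-1 label (0 or 1)."""
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)


def boundary_point(x0, x_adv, query, tol=1e-6):
    """Binary search on the segment from x0 to x_adv for a point that lies
    just on the adversarial side of the decision boundary."""
    y0 = query(x0)
    lo, hi = 0.0, 1.0  # fraction of the way from x0 to x_adv
    while hi - lo > tol:
        mid = (lo + hi) / 2
        x_mid = [a + mid * (c - a) for a, c in zip(x0, x_adv)]
        if query(x_mid) != y0:
            hi = mid  # still adversarial: move closer to x0
        else:
            lo = mid  # original label: move closer to x_adv
    return [a + hi * (c - a) for a, c in zip(x0, x_adv)]


def estimate_normal(x_b, y0, query, n_queries, sigma, rng):
    """Estimate the unit normal at x_b by averaging random Gaussian probes,
    each weighted +1 if the probe changes the label and -1 otherwise."""
    d = len(x_b)
    acc = [0.0] * d
    for _ in range(n_queries):
        eta = [rng.gauss(0.0, 1.0) for _ in range(d)]
        probe = [xb + sigma * e for xb, e in zip(x_b, eta)]
        sign = 1.0 if query(probe) != y0 else -1.0
        for i in range(d):
            acc[i] += sign * eta[i]
    norm = math.sqrt(sum(a * a for a in acc))
    return [a / norm for a in acc]


# Toy setup (assumed): a 10-dimensional linear classifier with a flat boundary.
d = 10
w = [float(i % 3 - 1) + 0.5 for i in range(d)]
b = 0.3


def query(x):
    return top1_label(x, w, b)


x0 = [-wi for wi in w]   # classified as label 0
x_adv = [wi for wi in w]  # classified as label 1

rng = random.Random(0)
y0 = query(x0)
x_b = boundary_point(x0, x_adv, query)
n_hat = estimate_normal(x_b, y0, query, n_queries=1000, sigma=0.1, rng=rng)

# Compare the estimate with the true normal w / ||w|| of the toy classifier.
w_norm = math.sqrt(sum(wi * wi for wi in w))
cos_sim = sum(n * wi / w_norm for n, wi in zip(n_hat, w))
print(f"cosine similarity with true normal: {cos_sim:.3f}")
```

On this toy linear classifier the boundary has zero curvature, so the estimated direction should align closely with the true normal. For a real image classifier the estimate would only be approximate, and its quality would depend on how many queries are spent at each iteration, which is the trade-off the abstract refers to when it mentions the distribution of queries over iterations.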
