Paper Title
Certified Training: Small Boxes are All You Need
Paper Authors
Paper Abstract
To obtain deterministic guarantees of adversarial robustness, specialized training methods are used. We propose SABR, a novel certified training method based on the key insight that propagating interval bounds for a small but carefully selected subset of the adversarial input region is sufficient to approximate the worst-case loss over the whole region while significantly reducing approximation errors. We show in an extensive empirical evaluation that SABR outperforms existing certified defenses in terms of both standard and certifiable accuracies across perturbation magnitudes and datasets, pointing to a new class of certified training methods that promise to alleviate the robustness-accuracy trade-off.
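
The key idea in the abstract, propagating interval bounds over a small box rather than the whole adversarial region, can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the toy two-layer ReLU network, the ratio `tau` between the small box and the full perturbation radius `eps`, the rule that places the small box around an adversarial point and shifts it to stay inside the full region, and the random corner point standing in for a real adversarial attack. It is not the authors' implementation.

```python
import numpy as np

def ibp_linear(lo, hi, W, b):
    """Propagate the box [lo, hi] through the affine map x -> W @ x + b."""
    center = (lo + hi) / 2.0
    radius = (hi - lo) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius  # worst-case growth of the box
    return out_center - out_radius, out_center + out_radius

def ibp_relu(lo, hi):
    """ReLU is monotone, so it can be applied to both bounds directly."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def ibp_network(lo, hi, layers):
    """Interval bound propagation through linear layers with ReLU in between."""
    for i, (W, b) in enumerate(layers):
        lo, hi = ibp_linear(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = ibp_relu(lo, hi)
    return lo, hi

def small_box(x, x_adv, eps, tau):
    """Hypothetical selection rule: a box of radius tau * eps around x_adv,
    with its center clipped so the box stays inside the eps-box around x."""
    center = np.clip(x_adv, x - eps + tau * eps, x + eps - tau * eps)
    return center - tau * eps, center + tau * eps

# Toy network and input (random weights, illustration only).
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(8, 4)), rng.normal(size=8)),
    (rng.normal(size=(3, 8)), rng.normal(size=3)),
]
x = rng.normal(size=4)
eps, tau = 0.1, 0.2

# Stand-in for an adversarial example: a random corner of the eps-box.
x_adv = x + eps * np.sign(rng.normal(size=4))

# Bounds over the whole adversarial region versus the small box.
full_lo, full_hi = ibp_network(x - eps, x + eps, layers)
box_lo, box_hi = small_box(x, x_adv, eps, tau)
small_lo, small_hi = ibp_network(box_lo, box_hi, layers)

print("output bound width, full region:", full_hi - full_lo)
print("output bound width, small box:  ", small_hi - small_lo)
```

Because the small box lies inside the full region, its output bounds are never looser than those of the full region. Bounds over the small box alone do not certify the whole region, so a sketch like this only illustrates the kind of approximation used during training.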