Detecting Adversarial Examples in Learning-Enabled Cyber-Physical Systems using Variational Autoencoder for Regression

Paper title

Detecting Adversarial Examples in Learning-Enabled Cyber-Physical Systems using Variational Autoencoder for Regression

Authors

Feiyang Cai, Jiani Li, Xenofon Koutsoukos

Abstract

Learning-enabled components (LECs) are widely used in cyber-physical systems (CPS) because they can handle the uncertainty and variability of the environment and increase the level of autonomy. However, LECs such as deep neural networks (DNNs) have been shown not to be robust, and adversarial examples can cause the model to make an incorrect prediction. The paper considers the problem of efficiently detecting adversarial examples in LECs used for regression in CPS. The proposed approach is based on inductive conformal prediction and uses a regression model based on a variational autoencoder. The architecture allows both the input and the neural network prediction to be taken into account when detecting adversarial and, more generally, out-of-distribution examples. We demonstrate the method using an advanced emergency braking system implemented in an open-source simulator for self-driving cars, where a DNN is used to estimate the distance to an obstacle. The simulation results show that the method can effectively detect adversarial examples with a short detection delay.
