Paper Title
ShieldNN: A Provably Safe NN Filter for Unsafe NN Controllers
Authors
Abstract
In this paper, we develop a novel closed-form Control Barrier Function (CBF) and an associated controller shield for the Kinematic Bicycle Model (KBM) with respect to obstacle avoidance. The proposed CBF and shield, designed by an algorithm we call ShieldNN, provide two crucial advantages over existing methodologies. First, ShieldNN handles steering and velocity constraints directly with the non-affine KBM dynamics; this is in contrast to more general methods, which typically consider only affine dynamics and do not guarantee invariance properties under control constraints. Second, ShieldNN provides a closed-form set of safe controls for each state, unlike more general methods, which typically rely on optimization algorithms to generate a single instantaneous control for each state. Together, these advantages make ShieldNN uniquely suited to efficient Multi-Obstacle Safe Actions (i.e., multiple-barrier-function shielding) during the training of a Reinforcement Learning (RL) enabled Neural Network controller. We show via experiments that ShieldNN dramatically increases the completion rate of RL training episodes in the presence of multiple obstacles, thus establishing the value of ShieldNN in training RL-based controllers.
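To illustrate why a closed-form safe set per state makes multiple-barrier shielding cheap, here is a minimal Python sketch. It is not the ShieldNN algorithm itself. It assumes, hypothetically, that each obstacle's barrier yields a closed interval of safe steering angles at the current state; the abstract does not specify the shape of these sets. The names `intersect_intervals` and `shield_action` are illustrative only.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: one closed interval (lo, hi) of safe
# steering angles per obstacle, evaluated at the current state.
Interval = Tuple[float, float]


def intersect_intervals(intervals: List[Interval]) -> Optional[Interval]:
    """Return the common interval, or None if the intersection is empty."""
    lo = max(i[0] for i in intervals)
    hi = min(i[1] for i in intervals)
    return (lo, hi) if lo <= hi else None


def shield_action(proposed: float, safe_sets: List[Interval]) -> Optional[float]:
    """Project the controller's proposed steering onto the common safe set.

    Returns the proposed action unchanged when it is already safe, the
    nearest safe value otherwise, and None when no action is safe for
    every obstacle simultaneously.
    """
    if not safe_sets:
        return proposed  # no obstacles, nothing to filter
    common = intersect_intervals(safe_sets)
    if common is None:
        return None
    lo, hi = common
    return min(max(proposed, lo), hi)
```

Under this assumption, shielding against several obstacles reduces to one max, one min, and one clamp per step, with no optimization problem to solve. An optimization-based filter that returns only a single safe control per obstacle would not expose the sets needed for this intersection.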