Transfer Learning for Efficient Iterative Safety Validation

Paper Title

Transfer Learning for Efficient Iterative Safety Validation

Authors

Anthony Corso, Mykel J. Kochenderfer

Abstract

Safety validation is important during the development of safety-critical autonomous systems but can require significant computational effort. Existing algorithms often start from scratch each time the system under test changes. We apply transfer learning to improve the efficiency of reinforcement-learning-based safety validation algorithms when applied to related systems. Knowledge from previous safety validation tasks is encoded through the action value function and transferred to future tasks with a learned set of attention weights. Including a learned state and action value transformation for each source task can improve performance even when systems have substantially different failure modes. We conduct experiments on safety validation tasks in gridworld and autonomous driving scenarios. We show that transfer learning can improve the initial and final performance of validation algorithms and reduce the number of training steps.
