Energy-Efficient Control Adaptation with Safety Guarantees for Learning-Enabled Cyber-Physical Systems

Paper Title

Energy-Efficient Control Adaptation with Safety Guarantees for Learning-Enabled Cyber-Physical Systems

Authors

Yixuan Wang, Chao Huang, Qi Zhu

Abstract

Neural networks have been increasingly applied for control in learning-enabled cyber-physical systems (LE-CPSs) and have shown great promise in improving system performance and efficiency, as well as in reducing the need for complex physical models. However, the lack of safety guarantees for such neural-network-based controllers has significantly impeded their adoption in safety-critical CPSs. In this work, we propose a controller adaptation approach that automatically switches among multiple controllers, including neural network controllers, to guarantee system safety and improve energy efficiency. Our approach includes two key components, based on formal methods and machine learning respectively. First, we approximate each controller with a Bernstein-polynomial-based hybrid system model under bounded disturbance, and compute a safe invariant set for each controller based on its corresponding hybrid system. Intuitively, the invariant set of a controller defines the region of the state space in which the system can always remain safe under that controller. The union of the controllers' invariant sets then defines a safe adaptation space that is larger than (or equal to) the invariant set of any single controller. Second, we develop a deep reinforcement learning method to learn a controller switching strategy that reduces the control/actuation energy cost, while a safety guard rule ensures that the system stays within the safe space. Experiments on a linear adaptive cruise control system and a non-linear Van der Pol oscillator demonstrate the effectiveness of our approach in saving energy and enhancing safety.
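
The switching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the axis-aligned `Box` standing in for an invariant set, the `guarded_choice` function, and the fallback order are all hypothetical choices made for illustration. The abstract does not specify how invariant sets are represented or how the guard rule resolves an unsafe proposal. The sketch only shows the general shape: a learned policy proposes a controller, and a guard accepts it only if the current state lies in that controller's invariant set.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box used as a stand-in for an invariant set (assumption)."""
    lower: Tuple[float, ...]
    upper: Tuple[float, ...]

    def contains(self, state: Sequence[float]) -> bool:
        # True when every coordinate lies within its [lower, upper] bounds.
        return all(lo <= s <= hi
                   for lo, s, hi in zip(self.lower, state, self.upper))


def guarded_choice(state: Sequence[float],
                   proposed: int,
                   invariant_sets: Sequence[Box]) -> Optional[int]:
    """Hypothetical guard rule: keep the proposed controller if the state is
    inside its invariant set; otherwise fall back to the first controller
    whose invariant set contains the state. Returns None when the state lies
    outside the union of all invariant sets."""
    if invariant_sets[proposed].contains(state):
        return proposed
    for index, inv in enumerate(invariant_sets):
        if inv.contains(state):
            return index
    return None


# Two illustrative invariant sets; their union is the safe adaptation space.
invariant_sets = [
    Box(lower=(-1.0, -1.0), upper=(1.0, 1.0)),   # controller 0
    Box(lower=(0.5, -2.0), upper=(3.0, 2.0)),    # controller 1
]
```

With these made-up sets, `guarded_choice((0.0, 0.0), 1, invariant_sets)` overrides the proposal and returns `0`, because the state lies only in controller 0's box, while `guarded_choice((2.0, 0.0), 1, invariant_sets)` accepts the proposal and returns `1`. In a full system the `proposed` index would come from the learned switching policy, which is outside the scope of this sketch.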
