Online Constrained Model-based Reinforcement Learning

Paper title

Online Constrained Model-based Reinforcement Learning

Authors

Benjamin van Niekerk, Andreas Damianou, Benjamin Rosman

Abstract

Applying reinforcement learning to robotic systems poses a number of challenging problems. A key requirement is the ability to handle continuous state and action spaces while remaining within a limited time and resource budget. Additionally, for safe operation, the system must make robust decisions under hard constraints. To address these challenges, we propose a model-based approach that combines Gaussian Process regression and Receding Horizon Control. Using sparse spectrum Gaussian Processes, we extend previous work by updating the dynamics model incrementally from a stream of sensory data. This results in an agent that can learn and plan in real time under non-linear constraints. We test our approach on a cart pole swing-up environment and demonstrate the benefits of online learning on an autonomous racing task. The environment's dynamics are learned from limited training data and can be reused in new task instances without retraining.
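
To illustrate the idea of updating a sparse spectrum Gaussian Process from a data stream, here is a minimal sketch in Python. It is not the authors' implementation. It assumes an RBF kernel approximated with random Fourier features, a unit Gaussian prior on the feature weights, and a plain linear solve at prediction time. The class name `IncrementalSSGP` and all parameter names are hypothetical.

```python
import numpy as np


class IncrementalSSGP:
    """Bayesian linear regression on random Fourier features (illustrative only)."""

    def __init__(self, input_dim, num_features=50, lengthscale=1.0,
                 signal_var=1.0, noise_var=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Frequencies drawn from the spectral density of an RBF kernel (assumption).
        self.W = rng.normal(scale=1.0 / lengthscale, size=(num_features, input_dim))
        self.m = num_features
        self.signal_var = signal_var
        self.noise_var = noise_var
        # Sufficient statistics: A = Phi^T Phi + noise_var * I, b = Phi^T y.
        self.A = noise_var * np.eye(2 * num_features)
        self.b = np.zeros(2 * num_features)

    def features(self, x):
        # Map an input to 2m trigonometric features whose inner product
        # approximates the kernel value.
        proj = self.W @ np.asarray(x, dtype=float)
        scale = np.sqrt(self.signal_var / self.m)
        return scale * np.concatenate([np.cos(proj), np.sin(proj)])

    def update(self, x, y):
        # Rank-one update of the statistics for one new observation (x, y).
        phi = self.features(x)
        self.A += np.outer(phi, phi)
        self.b += phi * y

    def predict(self, x):
        # Posterior predictive mean and variance at a single input.
        phi = self.features(x)
        mean = phi @ np.linalg.solve(self.A, self.b)
        var = self.noise_var * (1.0 + phi @ np.linalg.solve(self.A, phi))
        return mean, var
```

Each call to `update` costs a fixed amount of work that depends only on the number of features, not on how many observations have been seen, which is what makes this kind of model suitable for streaming data. In a model-based control loop, one such regressor per state dimension could be fitted to observed transitions and queried by the planner; that wiring is left out here.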
