Combining Model-Based and Model-Free Methods for Nonlinear Control: A Provably Convergent Policy Gradient Approach

Paper Title

Combining Model-Based and Model-Free Methods for Nonlinear Control: A Provably Convergent Policy Gradient Approach

Authors

Guannan Qu, Chenkai Yu, Steven Low, Adam Wierman

Abstract

Model-free learning-based control methods have seen great success recently. However, such methods typically suffer from poor sample complexity and limited convergence guarantees. This is in sharp contrast to classical model-based control, which has a rich theory but typically requires strong modeling assumptions. In this paper, we combine the two approaches to achieve the best of both worlds. We consider a dynamical system with both linear and non-linear components and develop a novel approach to use the linear model to define a warm start for a model-free, policy gradient method. We show this hybrid approach outperforms the model-based controller while avoiding the convergence issues associated with model-free approaches via both numerical experiments and theoretical analyses, in which we derive sufficient conditions on the non-linear component such that our approach is guaranteed to converge to the (nearly) global optimal controller.
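The warm-start idea in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm: the scalar system, the nonlinearity `0.1 * sin(x)`, the cost weights, and the finite-difference gradient with step halving (a stand-in for a sampled policy gradient) are all assumptions chosen for illustration. An LQR gain is computed from the linear part alone, and that gain is then refined by gradient steps on the cost measured on the full nonlinear system.

```python
import math

# Hypothetical scalar system: x' = A*x + B*u + f(x), with f(x) = 0.1*sin(x).
A, B, Q, R = 1.2, 1.0, 1.0, 1.0

def lqr_gain(a, b, q, r, iters=500):
    """Model-based step: scalar discrete-time LQR gain via Riccati iteration."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) * (a * b * p) / (r + b * b * p)
    return a * b * p / (r + b * b * p)

def rollout_cost(k, x0=1.0, horizon=50):
    """Finite-horizon cost of the policy u = -k*x on the nonlinear system."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += Q * x * x + R * u * u
        x = A * x + B * u + 0.1 * math.sin(x)  # assumed nonlinearity
    return cost

def refine(k, steps=200, lr=0.05, eps=1e-4):
    """Model-free step: finite-difference gradient descent on the gain k.

    A candidate gain is accepted only if it lowers the rollout cost;
    otherwise the step size is halved.
    """
    best = rollout_cost(k)
    for _ in range(steps):
        grad = (rollout_cost(k + eps) - rollout_cost(k - eps)) / (2 * eps)
        candidate = k - lr * grad
        cost = rollout_cost(candidate)
        if cost < best:
            k, best = candidate, cost
        else:
            lr *= 0.5
    return k, best

k_warm = lqr_gain(A, B, Q, R)      # warm start from the linear model
k_refined, cost_refined = refine(k_warm)
print(k_warm, rollout_cost(k_warm), k_refined, cost_refined)
```

Because a candidate gain is kept only when it lowers the cost, the refined gain never does worse than the warm start on this toy problem. That property comes from the acceptance rule in the sketch, not from the convergence conditions derived in the paper.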
