Approximate Dynamic Programming for Constrained Linear Systems: A Piecewise Quadratic Approximation Approach

Paper Title

Approximate Dynamic Programming for Constrained Linear Systems: A Piecewise Quadratic Approximation Approach

Authors

Kanghui He, Shengling Shi, Ton van den Boom, Bart De Schutter

Abstract

Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability guarantees, although its computation is sometimes prohibitive. This paper introduces an approach combining the two methodologies to overcome their individual limitations. The predictive control law for constrained linear quadratic regulation (CLQR) problems has been proven to be piecewise affine (PWA) while the value function is piecewise quadratic. We exploit these formal results from MPC to design an ADP method for CLQR problems. A novel convex and piecewise quadratic neural network with a local-global architecture is proposed to provide an accurate approximation of the value function, which is used as the cost-to-go function in the online dynamic programming problem. An efficient decomposition algorithm is developed to speed up the online computation. Rigorous stability analysis of the closed-loop system is conducted for the proposed control scheme under the condition that a good approximation of the value function is achieved. Comparative simulations are carried out to demonstrate the potential of the proposed method in terms of online computation and optimality.
