Constrained Differential Dynamic Programming Revisited

Paper title

Constrained Differential Dynamic Programming Revisited

Authors

Yuichiro Aoyama, George Boutselis, Akash Patel, Evangelos A. Theodorou

Abstract

Differential Dynamic Programming (DDP) has become a well-established method for unconstrained trajectory optimization. Despite its several applications in robotics and control, however, a widely successful constrained version of the algorithm has yet to be developed. This paper builds on penalty methods and active-set approaches to design a Dynamic Programming-based methodology for constrained optimal control. Regarding the former, our derivation employs a constrained version of Bellman's principle of optimality by introducing a set of auxiliary slack variables in the backward pass. In parallel, we show how Augmented Lagrangian methods can be naturally incorporated within DDP by using a particular set of penalty-Lagrangian functions that preserve second-order differentiability. We demonstrate experimentally that our extensions, individually and in combination, significantly improve the convergence properties of the algorithm and outperform previous approaches on a large number of simulated scenarios.
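
To make the idea of a twice-differentiable penalty-Lagrangian function concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm: it uses the exponential penalty-Lagrangian, one well-known smooth choice that may differ from the functions the authors use, and applies it to a hypothetical one-dimensional static problem (minimize (x - 2)^2 subject to x - 1 <= 0) rather than inside a DDP backward pass. The function name and all parameter values are made up for the example.

```python
import math


def exponential_multiplier_method(mu=1.0, lam=1.0, x=0.0, outer_iters=60):
    """Toy problem (an assumption for illustration):
        minimize (x - 2)^2  subject to  g(x) = x - 1 <= 0.

    Penalty-Lagrangian term: (lam / mu) * (exp(mu * g(x)) - 1).
    It is smooth in x, so the inner problem has a well-defined Hessian.
    """
    for _ in range(outer_iters):
        # Inner loop: Newton's method on the augmented objective in x.
        for _ in range(50):
            e = math.exp(mu * (x - 1.0))
            grad = 2.0 * (x - 2.0) + lam * e  # first derivative in x
            hess = 2.0 + lam * mu * e         # second derivative, always > 0
            step = grad / hess
            x -= step
            if abs(step) < 1e-12:
                break
        # Outer loop: multiplicative multiplier update, keeps lam positive.
        lam *= math.exp(mu * (x - 1.0))
    return x, lam


x_opt, lam_opt = exponential_multiplier_method()
print(x_opt, lam_opt)
```

For this toy problem the KKT conditions give the constrained minimizer x = 1 with multiplier 2, and the iterates approach those values. Because the penalty term is smooth, the inner solve can use second-order information throughout, which is the property the abstract highlights for DDP.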
