Paper title
Constrained Differential Dynamic Programming Revisited
Authors
Abstract
Differential Dynamic Programming (DDP) has become a well-established method for unconstrained trajectory optimization. Despite its many applications in robotics and control, however, a widely successful constrained version of the algorithm has yet to be developed. This paper builds upon penalty methods and active-set approaches to design a Dynamic Programming-based methodology for constrained optimal control. Regarding the former, our derivation employs a constrained version of Bellman's principle of optimality by introducing a set of auxiliary slack variables in the backward pass. In parallel, we show how Augmented Lagrangian methods can be naturally incorporated within DDP by utilizing a particular set of penalty-Lagrangian functions that preserve second-order differentiability. We demonstrate experimentally that our extensions, individually and in combination, significantly enhance the convergence properties of the algorithm and outperform previous approaches on a large number of simulated scenarios.
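To make the Augmented Lagrangian idea mentioned in the abstract concrete, the following is a minimal sketch on a scalar toy problem, not the paper's algorithm. It assumes the classical Powell-Hestenes-Rockafellar (PHR) penalty, which is only once continuously differentiable; the abstract refers to a different family of penalty-Lagrangian functions that preserve second-order differentiability, and those are not reproduced here. The toy objective, the constraint, the function name `solve_toy_augmented_lagrangian`, and all parameter values are illustrative assumptions.

```python
def solve_toy_augmented_lagrangian(mu=10.0, outer_iters=20, inner_iters=200):
    """Minimize (x - 2)^2 subject to x - 1 <= 0 with a PHR augmented Lagrangian.

    Hypothetical toy problem chosen for illustration only; it involves no
    dynamics and no DDP backward or forward pass.
    """
    f_grad = lambda x: 2.0 * (x - 2.0)   # gradient of the objective
    g = lambda x: x - 1.0                # inequality constraint g(x) <= 0
    g_grad = 1.0                         # gradient of the constraint

    x, lam = 0.0, 0.0                    # primal iterate and multiplier estimate
    step = 1.0 / (2.0 + mu)              # 1 / Lipschitz constant of the AL gradient

    for _ in range(outer_iters):
        # Inner loop: gradient descent on the augmented Lagrangian in x,
        # with the multiplier estimate held fixed.
        for _ in range(inner_iters):
            shifted = max(0.0, lam + mu * g(x))
            x -= step * (f_grad(x) + shifted * g_grad)
        # Outer loop: first-order multiplier update for an inequality constraint.
        lam = max(0.0, lam + mu * g(x))
    return x, lam


if __name__ == "__main__":
    x_opt, lam_opt = solve_toy_augmented_lagrangian()
    print(x_opt, lam_opt)
```

For this toy problem the KKT conditions give the constrained minimizer x = 1 with multiplier 2, so the returned pair should approach those values. The point of the sketch is the two-level structure: an inner minimization with fixed multipliers and an outer multiplier update, with the penalty weight `mu` kept finite throughout.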