Paper title
Exploiting higher-order derivatives in convex optimization methods
Paper authors
Paper abstract
Exploiting higher-order derivatives in convex optimization has been known at least since the 1970s. In each iteration, higher-order (also called tensor) methods minimize a regularized Taylor expansion of the objective function, which leads to faster convergence rates if the corresponding higher-order derivative is Lipschitz continuous. Recently, a series of lower iteration-complexity bounds was proved for such methods, revealing a gap between the upper and lower complexity bounds. Moreover, it was shown that such methods are implementable, since the appropriately regularized Taylor expansion of a convex function is itself convex and can therefore be minimized in polynomial time. Only very recently was an algorithm with the optimal convergence rate $1/k^{(3p+1)/2}$ proposed for minimizing convex functions with Lipschitz $p$-th derivative. For convex functions with Lipschitz third derivative, these developments made it possible to propose a second-order method with convergence rate $1/k^5$, which is faster than the rate $1/k^{3.5}$ of existing second-order methods.
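For concreteness, the basic tensor step the abstract refers to can be written as follows (the notation is ours, following the standard regularized Taylor model; it is a sketch of the general scheme, not a transcription of the paper's algorithm). At iteration $k$ the method of order $p$ sets

$$
x_{k+1} \in \operatorname*{arg\,min}_{y} \left[ f(x_k) + \sum_{i=1}^{p} \frac{1}{i!}\, D^i f(x_k)[y - x_k]^i + \frac{M}{(p+1)!}\,\|y - x_k\|^{p+1} \right],
$$

where $D^i f(x)[h]^i$ denotes the $i$-th directional derivative along $h$ and $M$ is a regularization parameter tied to the Lipschitz constant $L_p$ of the $p$-th derivative. The implementability result mentioned in the abstract is that, for convex $f$ and $M$ a sufficiently large multiple of $L_p$ (e.g. $M \ge p L_p$ in Nesterov's analysis), this model is itself convex, so the subproblem can be solved in polynomial time.

Below is a minimal numerical sketch of the $p = 2$ case (the cubic-regularized Newton step), assuming a twice-differentiable convex objective whose gradient and Hessian are supplied by the caller. The function names, the quartic test objective, and the bisection tolerance are illustrative choices, not the paper's.

```python
import numpy as np

def cubic_newton_step(grad, hess, x, M, tol=1e-10, max_iter=100):
    """One cubic-regularized Newton step (the p = 2 tensor method):
    minimize g^T s + (1/2) s^T H s + (M/6) ||s||^3 over s = y - x.

    Stationarity gives (H + (M r / 2) I) s = -g with r = ||s||, so we
    bisect on the scalar radius r; H is PSD because f is convex."""
    g, H = grad(x), hess(x)
    n = len(x)

    def step_for(r):
        # Solve the shifted Newton system for a trial radius r > 0.
        return np.linalg.solve(H + 0.5 * M * r * np.eye(n), -g)

    # Bracket the fixed point r = ||step_for(r)||: the residual
    # ||step_for(r)|| - r is decreasing in r, so bisection applies.
    lo, hi = 0.0, 1.0
    while np.linalg.norm(step_for(hi)) > hi:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(step_for(mid)) > mid:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return x + step_for(0.5 * (lo + hi))

if __name__ == "__main__":
    # Toy usage: f(x) = ||x||^4, convex with a Hessian that is
    # Lipschitz on bounded sets; M is a rough local estimate of L_2.
    f_grad = lambda x: 4.0 * np.dot(x, x) * x
    f_hess = lambda x: 4.0 * np.dot(x, x) * np.eye(len(x)) + 8.0 * np.outer(x, x)
    x = np.array([1.0, -2.0])
    for _ in range(20):
        x = cubic_newton_step(f_grad, f_hess, x, M=50.0)
    print(x)  # approaches the minimizer at the origin
```

The bisection works because, for PSD $H$, the map $r \mapsto \|(H + (Mr/2) I)^{-1} g\|$ is nonincreasing, so the radius satisfying the stationarity condition is unique. Higher-order ($p \ge 3$) subproblems and the accelerated schemes achieving the $1/k^{(3p+1)/2}$ rate require more machinery than this sketch shows.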