论文标题

不精确的近端梯度方法,具有线路搜索,并降低了仿射约束和双线性鞍点结构化凸问题的复杂性

Inexact accelerated proximal gradient method with line search and reduced complexity for affine-constrained and bilinear saddle-point structured convex problems

论文作者

Lin, Qihang, Xu, Yangyang

论文摘要

本文的目的是减少两类问题的基于梯度方法的总复杂性:仿射约束的复合凸优化和双线性鞍点结构化的非平滑凸优化。我们的技术基于双环不精确的加速近端梯度(APG)方法,用于最大程度地减少非平滑但可近距离的凸功能的总和,以及两个具有不同平滑度常数和计算成本的平滑凸功能。与标准APG方法相比,如果一种平滑的组件具有更高的计算成本,但平稳性较小,则不精确的APG方法可以降低总计算成本。借助此属性,可以应用不精确的APG方法来近似求解近端增强的拉格朗日方法的子问题,用于仿射约束的复合凸优化和双线性鞍点结构性的非滑动非平滑凸优化的光滑近似值,具有较小的光滑度具有更高的计算性成本,具有较小的平滑性成本。因此,它可以降低查找大约最佳/固定解决方案的总复杂性。该技术类似于文献中的梯度滑动技术。不同之处在于,我们的不精确APG方法可以通过基于违反平稳性的计算条件来有效地停止内部循环,而梯度滑动方法则需要预先指定内部循环的迭代次数。数值实验表明,在最佳原始双偶一阶方法和梯度滑动方法上,我们的方法的效率明显更高。

The goal of this paper is to reduce the total complexity of gradient-based methods for two classes of problems: affine-constrained composite convex optimization and bilinear saddle-point structured non-smooth convex optimization. Our technique is based on a double-loop inexact accelerated proximal gradient (APG) method for minimizing the summation of a non-smooth but proximable convex function and two smooth convex functions with different smoothness constants and computational costs. Compared to the standard APG method, the inexact APG method can reduce the total computation cost if one smooth component has higher computational cost but a smaller smoothness constant than the other. With this property, the inexact APG method can be applied to approximately solve the subproblems of a proximal augmented Lagrangian method for affine-constrained composite convex optimization and the smooth approximation for bilinear saddle-point structured non-smooth convex optimization, where the smooth function with a smaller smoothness constant has significantly higher computational cost. Thus it can reduce total complexity for finding an approximately optimal/stationary solution. This technique is similar to the gradient sliding technique in the literature. The difference is that our inexact APG method can efficiently stop the inner loop by using a computable condition based on a measure of stationarity violation, while the gradient sliding methods need to pre-specify the number of iterations for the inner loop. Numerical experiments demonstrate significantly higher efficiency of our methods over an optimal primal-dual first-order method and the gradient sliding methods.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源