Paper Title
A note on $R$-linear convergence of nonmonotone gradient methods
Paper Authors
Paper Abstract
Nonmonotone gradient methods generally perform better than their monotone counterparts, especially on unconstrained quadratic optimization. However, the known convergence rates of monotone methods are often much better than those of their nonmonotone variants. To narrow this gap between the theory and practice of nonmonotone gradient methods, we introduce a property shared by a large collection of gradient methods. We prove that any gradient method whose stepsizes satisfy this property converges $R$-linearly at a rate of $1-\lambda_1/M_1$, where $\lambda_1$ is the smallest eigenvalue of the Hessian matrix and $M_1$ is an upper bound on the inverse stepsizes. Our results show that the existing convergence rates of many nonmonotone methods can be improved to $1-1/\kappa$, where $\kappa$ is the associated condition number.
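As a rough numerical illustration (not taken from the paper), the sketch below runs the Barzilai-Borwein (BB1) stepsize, a classic nonmonotone gradient method of the kind the abstract refers to, on an unconstrained convex quadratic and compares the observed average decay rate of the gradient norm against the $1-1/\kappa$ rate mentioned above. The matrix size, spectrum, and iteration count are arbitrary choices for demonstration only.

```python
import numpy as np

# Illustrative sketch: BB1 on f(x) = 0.5 x^T A x with a known spectrum,
# so lambda_1, lambda_n, and kappa are available in closed form.
rng = np.random.default_rng(0)
n = 50
eigs = np.linspace(1.0, 100.0, n)      # lambda_1 = 1, lambda_n = 100
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigs) @ Q.T            # SPD Hessian with eigenvalues `eigs`
kappa = eigs[-1] / eigs[0]             # condition number

x = rng.standard_normal(n)
g = A @ x                              # gradient of 0.5 x^T A x
alpha = 1.0 / eigs[-1]                 # safe initial stepsize
norms = [np.linalg.norm(g)]
num_iters = 300

for _ in range(num_iters):
    x_new = x - alpha * g
    g_new = A @ x_new
    s, y = x_new - x, g_new - g
    alpha = (s @ s) / (s @ y)          # BB1 stepsize; 1/alpha lies in [lambda_1, lambda_n]
    x, g = x_new, g_new
    norms.append(np.linalg.norm(g))    # typically decays nonmonotonically

# Geometric-mean per-iteration decay of the gradient norm vs. 1 - 1/kappa.
observed = (norms[-1] / norms[0]) ** (1.0 / num_iters)
print(f"observed average rate ~ {observed:.4f}")
print(f"1 - 1/kappa           = {1 - 1/kappa:.4f}")
```

Note that individual BB1 steps can increase the gradient norm, which is why the sketch measures a geometric-mean rate over the whole run rather than a per-step ratio; this matches the $R$-linear (rather than $Q$-linear) notion of convergence analyzed in the paper.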