Paper Title
Escape saddle points faster on manifolds via perturbed Riemannian stochastic recursive gradient
Paper Authors
Paper Abstract
In this paper, we propose a variant of the Riemannian stochastic recursive gradient method that achieves a second-order convergence guarantee and escapes saddle points using simple perturbations. The idea is to perturb the iterates when the gradient is small and carry out stochastic recursive gradient updates over the tangent space. This avoids the complications of exploiting Riemannian geometry. We show that in the finite-sum setting, our algorithm requires $\widetilde{\mathcal{O}}\big( \frac{\sqrt{n}}{\epsilon^2} + \frac{\sqrt{n}}{\delta^4} + \frac{n}{\delta^3} \big)$ stochastic gradient queries to find an $(\epsilon, \delta)$-second-order critical point. This strictly improves on the complexity of perturbed Riemannian gradient descent and is superior to perturbed Riemannian accelerated gradient descent in large-sample settings. We also provide a complexity of $\widetilde{\mathcal{O}}\big( \frac{1}{\epsilon^3} + \frac{1}{\delta^3 \epsilon^2} + \frac{1}{\delta^4 \epsilon} \big)$ for online optimization, which is novel on Riemannian manifolds in terms of second-order convergence using only first-order information.
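To make the update scheme concrete, below is a minimal illustrative sketch of a perturbed Riemannian stochastic recursive gradient loop, not the authors' exact algorithm. It assumes the unit sphere as the manifold and uses orthogonal projection for both the retraction and the vector transport; the function name `perturbed_rsrg` and the parameters `eta`, `eps`, and `radius` are hypothetical choices for this example.

```python
import numpy as np

def proj(x, v):
    """Orthogonal projection of v onto the tangent space of the unit sphere at x."""
    return v - np.dot(x, v) * x

def retract(x, v):
    """Retraction on the sphere: move along the tangent vector, then renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def perturbed_rsrg(grads, x0, eta=0.1, eps=1e-3, radius=1e-2, n_epochs=50, seed=0):
    """Perturbed Riemannian stochastic recursive gradient (SARAH-style estimator)
    on the unit sphere. `grads` is a list of per-sample Euclidean gradient
    functions whose average is the full gradient of the objective."""
    rng = np.random.default_rng(seed)
    n = len(grads)
    epoch_len = max(1, int(np.sqrt(n)))  # inner-loop length on the order of sqrt(n)
    x = x0 / np.linalg.norm(x0)
    for _ in range(n_epochs):
        # Full Riemannian gradient at the epoch anchor point.
        v = proj(x, sum(g(x) for g in grads) / n)
        if np.linalg.norm(v) <= eps:
            # Gradient is small: perturb uniformly within a tangent ball
            # so the iterate can escape a potential saddle point.
            noise = proj(x, rng.normal(size=x.shape))
            norm = np.linalg.norm(noise)
            if norm > 0:
                noise *= radius * rng.uniform() / norm
            x = retract(x, noise)
            v = proj(x, sum(g(x) for g in grads) / n)
        for _ in range(epoch_len):
            x_prev, v_prev = x, v
            x = retract(x, -eta * v)
            i = rng.integers(n)
            # Recursive estimator: fresh stochastic gradient at the new point,
            # corrected by the transported previous gradient and estimator.
            g_new = proj(x, grads[i](x))
            g_old = proj(x_prev, grads[i](x_prev))
            v = g_new - proj(x, g_old) + proj(x, v_prev)
    return x
```

As a quick sanity check, one can minimize a negative Rayleigh quotient on the sphere by passing `grads = [lambda x, a=a: -np.outer(a, a) @ x for a in A]` for sample vectors `A`; the iterate should converge toward the leading eigendirection, with the perturbation step helping it leave the saddle points at the other eigenvectors.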