Paper Title
Exit Time Analysis for Approximations of Gradient Descent Trajectories Around Saddle Points
Authors
Abstract
This paper considers the problem of understanding the exit time of trajectories of gradient-related first-order methods from saddle neighborhoods under some initial boundary conditions. Given the 'flat' geometry around saddle points, first-order methods can struggle to escape these regions quickly because the gradients encountered there have small magnitudes. In particular, while it is known that gradient-related first-order methods escape strict-saddle neighborhoods, existing analytic techniques do not explicitly leverage the local geometry around saddle points to control the behavior of gradient trajectories. It is in this context that this paper puts forth a rigorous geometric analysis of the gradient-descent method around strict-saddle neighborhoods using matrix perturbation theory. In doing so, it provides a key result that can be used to generate an approximate gradient trajectory for any given initial conditions. In addition, the analysis leads to a linear exit-time solution for the gradient-descent method under certain necessary initial conditions, which explicitly brings out the dependence on problem dimension, conditioning of the saddle neighborhood, and more, for a class of strict-saddle functions.
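To make the notion of exit time concrete, here is a minimal toy sketch. It is not the paper's analysis or method. It assumes a purely quadratic strict saddle f(x) = 0.5 xᵀHx with a hypothetical diagonal H, and the step size, ball radius, and helper name `exit_time` are illustrative choices. It counts how many gradient-descent steps pass before the iterate leaves a ball around the saddle, for different initial components along the negative-curvature direction.

```python
import numpy as np

def exit_time(H, x0, step, radius, max_iters=10_000):
    """Count gradient-descent steps on f(x) = 0.5 * x^T H x until ||x|| > radius.

    Returns None if the iterate has not left the ball within max_iters steps.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(max_iters):
        if np.linalg.norm(x) > radius:
            return k
        x = x - step * (H @ x)  # the gradient of the quadratic is H x
    return None

# Assumed toy saddle at the origin: one negative and one positive curvature direction.
H = np.diag([-1.0, 2.0])

for eps in (1e-2, 1e-4, 1e-8):
    # eps is the initial component along the escape (negative-curvature) direction.
    x0 = np.array([eps, 0.5])
    print(eps, exit_time(H, x0, step=0.1, radius=1.0))
```

In this toy setting, the smaller the initial component along the escape direction, the longer the iterate stays near the saddle. An initial point with no component along that direction never leaves, which is one simple way to see why conditions on the initial point matter.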