Paper Title

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

Paper Authors

Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Paper Abstract

Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability to find good minima instead of being trapped in spurious ones remains to a large extent an open problem. Here we focus on gradient flow dynamics for phase retrieval from random measurements. When the ratio of the number of measurements to the input dimension is small, the dynamics remains trapped in spurious minima with large basins of attraction. We find analytically that above a critical ratio those critical points become unstable, developing a negative direction toward the signal. By numerical experiments we show that in this regime the gradient flow algorithm is not trapped; it drifts away from the spurious critical points along the unstable direction and succeeds in finding the global minimum. Using tools from statistical physics, we characterize this phenomenon, which is related to a BBP-type transition in the Hessian of the spurious minima.
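The setting described in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: it runs gradient descent (an Euler discretization of gradient flow) on the empirical phase-retrieval loss L(x) = (1/4m) Σ_μ ((a_μ·x)² − y_μ)², with phaseless measurements y_μ = (a_μ·x*)² and Gaussian sensing vectors a_μ. The dimension n, ratio α = m/n, step size, and iteration count are illustrative assumptions; α is chosen well above the critical ratio, where the abstract says the spurious critical points are unstable and the dynamics aligns with the signal.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100                      # input dimension (illustrative choice)
alpha = 10.0                 # ratio m/n, assumed well above the critical ratio
m = int(alpha * n)

# Unit-norm signal and phaseless random measurements y_mu = (a_mu . x*)^2
x_star = rng.standard_normal(n)
x_star /= np.linalg.norm(x_star)
A = rng.standard_normal((m, n))
y = (A @ x_star) ** 2

def loss_and_grad(x):
    """Empirical loss L(x) = (1/4m) sum_mu ((a_mu.x)^2 - y_mu)^2 and its gradient."""
    h = A @ x
    r = h ** 2 - y
    loss = np.sum(r ** 2) / (4 * m)
    grad = A.T @ (r * h) / m
    return loss, grad

# Gradient descent as a discretization of gradient flow x' = -grad L(x)
x = rng.standard_normal(n) / np.sqrt(n)   # random initialization, ||x|| ~ 1
lr, steps = 0.05, 5000                    # assumed step size and horizon
for _ in range(steps):
    _, g = loss_and_grad(x)
    x -= lr * g

# Overlap with the signal, up to the global sign ambiguity of phase retrieval
overlap = abs(x @ x_star) / (np.linalg.norm(x) * np.linalg.norm(x_star))
final_loss, _ = loss_and_grad(x)
print(f"overlap = {overlap:.3f}, loss = {final_loss:.2e}")
```

Lowering `alpha` toward and below the critical ratio should make runs like this increasingly likely to stall at a spurious minimum nearly orthogonal to the signal, mirroring the trapped regime described in the abstract.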
