Paper Title

Convergence Rates of Training Deep Neural Networks via Alternating Minimization Methods

Authors

Jintao Xu, Chenglong Bao, Wenxun Xing

Abstract

Training deep neural networks (DNNs) is an important and challenging optimization problem in machine learning due to its non-convexity and non-separable structure. Alternating minimization (AM) approaches split the composition structure of DNNs and have drawn great interest in the deep learning and optimization communities. In this paper, we propose a unified framework for analyzing the convergence rate of AM-type network training methods. Our analysis is based on non-monotone $j$-step sufficient decrease conditions and the Kurdyka-Łojasiewicz (KL) property, which relaxes the requirement of designing descent algorithms. We show detailed local convergence rates when the KL exponent $\theta$ varies in $[0,1)$. Moreover, local R-linear convergence is discussed under a stronger $j$-step sufficient decrease condition.
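As context for the abstract, here is a minimal sketch of the standard forms these assumptions usually take in the KL literature; the paper's precise $j$-step conditions may differ. A function $F$ is said to have the KL property at a critical point $x^{*}$ with exponent $\theta \in [0,1)$ if, near $x^{*}$,

\[
  \operatorname{dist}\bigl(0,\partial F(x)\bigr) \;\ge\; c\,\bigl(F(x)-F(x^{*})\bigr)^{\theta},
\]

and a one-step sufficient decrease condition for iterates $\{x^{k}\}$ typically reads

\[
  F(x^{k+1}) \;\le\; F(x^{k}) - a\,\|x^{k+1}-x^{k}\|^{2},
\]

with the non-monotone $j$-step variant relaxing this to hold only across blocks of $j$ iterations. Under the KL property, the classical local rates are: finite termination for $\theta = 0$, R-linear convergence for $\theta \in (0, 1/2]$, and a sublinear rate $O\bigl(k^{-(1-\theta)/(2\theta-1)}\bigr)$ for $\theta \in (1/2, 1)$.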
