Title
A Fast and Convergent Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-level Optimization
Authors
Abstract
Many important machine learning applications involve regularized nonconvex bi-level optimization. However, existing gradient-based bi-level optimization algorithms cannot handle nonconvex or nonsmooth regularizers, and they suffer from a high computational complexity in nonconvex bi-level optimization. In this work, we study a proximal gradient-type algorithm that adopts the approximate implicit differentiation (AID) scheme for nonconvex bi-level optimization with possibly nonconvex and nonsmooth regularizers. In particular, the algorithm applies Nesterov's momentum to accelerate the computation of the implicit gradient involved in AID. We provide a comprehensive analysis of the global convergence properties of this algorithm by identifying its intrinsic potential function. In particular, we formally establish the convergence of the model parameters to a critical point of the bi-level problem, and obtain an improved computational complexity $\mathcal{O}(\kappa^{3.5}\epsilon^{-2})$ over the state-of-the-art result. Moreover, we analyze the asymptotic convergence rates of this algorithm under a class of local nonconvex geometries characterized by a Łojasiewicz-type gradient inequality. Experiments on hyper-parameter optimization demonstrate the effectiveness of our algorithm.
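To make the abstract's setup concrete, the following is a minimal sketch of a proximal gradient step with AID on a toy quadratic bi-level instance. All problem data (the matrix `A`, vector `b`, the ℓ1 regularizer, and all step sizes) are illustrative assumptions, not taken from the paper; the sketch only mirrors the described structure: solve the lower-level problem and the AID linear system with Nesterov's momentum, form the implicit (hyper)gradient, then apply a proximal update to handle the nonsmooth regularizer.

```python
import numpy as np

# Toy instance (hypothetical, not from the paper):
#   lower level:  g(x, y) = 0.5 * ||A y - x||^2
#   upper level:  f(x, y) = 0.5 * ||y - b||^2,  regularizer h(x) = lam * ||x||_1
rng = np.random.default_rng(0)
n = 5
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # well-conditioned
b = rng.standard_normal(n)
lam = 0.01

H = A.T @ A                     # lower-level Hessian (constant for this toy g)
L_g = np.linalg.norm(H, 2)      # smoothness constant of the lower level

def prox_l1(z, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def agd(grad, v0, L, steps=300):
    """Nesterov's accelerated gradient descent for a smooth convex objective."""
    v, v_prev = v0.copy(), v0.copy()
    for k in range(1, steps + 1):
        z = v + (k - 1) / (k + 2) * (v - v_prev)  # momentum extrapolation
        v_prev = v
        v = z - grad(z) / L
    return v

def hypergradient(x):
    # 1) Solve the lower-level problem with Nesterov's momentum.
    y = agd(lambda y: A.T @ (A @ y - x), np.zeros(n), L_g)
    # 2) AID: solve H v = grad_y f(x, y), again accelerated by momentum.
    r = y - b
    v = agd(lambda v: H @ v - r, np.zeros(n), L_g)
    # 3) grad Phi(x) = grad_x f - (d/dx grad_y g)^T v; here grad_x f = 0
    #    and d/dx grad_y g = -A^T, so the hypergradient reduces to A v.
    return A @ v

def objective(x):
    y = agd(lambda y: A.T @ (A @ y - x), np.zeros(n), L_g)
    return 0.5 * np.sum((y - b) ** 2) + lam * np.sum(np.abs(x))

alpha = 0.5                     # upper-level step size (assumed)
x = np.zeros(n)
obj0 = objective(x)
for _ in range(20):
    x = prox_l1(x - alpha * hypergradient(x), alpha * lam)  # proximal step
obj_final = objective(x)
```

On this toy instance the regularized upper-level objective decreases monotonically from its starting value; the prox step is what lets the method accommodate the nonsmooth ℓ1 term that plain gradient-based bi-level schemes cannot handle.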