Paper title
Convergence analysis of critical point regularization with non-convex regularizers
Paper authors
Paper abstract
One of the key assumptions in the stability and convergence analysis of variational regularization is the ability to find global minimizers. However, this assumption is often not feasible when the regularizer is a black box or non-convex, which makes the search for global minimizers of the involved Tikhonov functional a challenging task. This is in particular the case for the emerging class of learned regularizers defined by neural networks. Instead, standard minimization schemes are applied, which typically only guarantee that a critical point is found. To address this issue, in this paper we study stability and convergence properties of critical points of Tikhonov functionals with a possibly non-convex regularizer. To this end, we introduce the concept of relative sub-differentiability and study its basic properties. Based on this concept, we develop a convergence analysis assuming relative sub-differentiability of the regularizer. The rationale behind the proposed concept is that critical points of the Tikhonov functional are also relative critical points and that for the latter a convergence theory can be developed. For the case where the noise level tends to zero, we derive a limiting problem representing first-order optimality conditions of a related restricted optimization problem. Besides this, we also give a comparison with classical methods and show that the class of ReLU networks is an appropriate choice for the regularization functional. Finally, we provide numerical simulations that support our theoretical findings and the need for the sort of analysis that we provide in this paper.
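
The gap between critical points and global minimizers described in the abstract can be seen in a toy example. The sketch below is not taken from the paper: the scalar forward operator F(x) = x, the double-well regularizer R(x) = (x^2 - 1)^2, the data value, the regularization parameter, and the step size are all assumptions chosen for illustration. Plain gradient descent on the resulting Tikhonov functional reaches a different critical point depending on the starting value, and only one of them is the global minimizer.

```python
# Toy illustration (assumed setup, not the paper's experiments):
# Tikhonov functional T(x) = 0.5 * (x - y)**2 + alpha * R(x)
# with the non-convex double-well regularizer R(x) = (x**2 - 1)**2.

Y_DELTA = 0.1   # assumed noisy data value
ALPHA = 1.0     # assumed regularization parameter


def tikhonov(x, y=Y_DELTA, alpha=ALPHA):
    """Value of the Tikhonov functional at x."""
    return 0.5 * (x - y) ** 2 + alpha * (x ** 2 - 1.0) ** 2


def tikhonov_grad(x, y=Y_DELTA, alpha=ALPHA):
    """Derivative of the Tikhonov functional at x."""
    return (x - y) + alpha * 4.0 * x * (x ** 2 - 1.0)


def gradient_descent(x0, step=0.05, iterations=500):
    """Fixed-step gradient descent; it only targets a critical point."""
    x = x0
    for _ in range(iterations):
        x -= step * tikhonov_grad(x)
    return x


# Two starting values lead to two different critical points.
x_right = gradient_descent(1.0)
x_left = gradient_descent(-1.0)

print("start  1.0 ->", x_right, "T =", tikhonov(x_right))
print("start -1.0 ->", x_left, "T =", tikhonov(x_left))
```

Both returned points have a vanishing derivative, but the one reached from the negative start has the larger functional value, so a convergence theory that assumes global minimizers would not cover it.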