Paper title
Differentiable Forward and Backward Fixed-Point Iteration Layers
Paper authors
Paper abstract
Recently, several studies have proposed using certain classes of optimization problems as layers in deep neural networks, in order to encode constraints that conventional layers cannot capture. However, these methods are still in their infancy and require special treatment, such as analyzing the KKT conditions, to derive the backpropagation formula. In this paper, we propose a new layer formulation called the fixed-point iteration (FPI) layer, which facilitates the use of more complicated operations in deep networks. We also propose a backward FPI layer for backpropagation, motivated by the recurrent back-propagation (RBP) algorithm. In contrast to RBP, however, the backward FPI layer yields the gradient through a small network module without explicitly computing the Jacobian. In practice, both the forward and backward FPI layers can be treated as nodes in the computational graph. All components of the proposed method are implemented at a high level of abstraction, which allows efficient higher-order differentiation on the nodes. In addition, we present two practical variants of the FPI layer, FPI_NN and FPI_GD, in which the FPI update operation is a small neural network module and a single gradient descent step based on a learnable cost function, respectively. FPI_NN is intuitive, simple, and fast to train, while FPI_GD can be used for efficient training of the energy networks that have been studied recently. While RBP and its related studies have not been applied to practical examples, our experiments show that the FPI layer can be successfully applied to real-world problems such as image denoising, optical flow, and multi-label classification.
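The forward/backward pairing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the update map g(x, z) = tanh(Wx + z), the matrix W, the quadratic loss, and all function names are assumptions chosen so that g is a contraction and the example stays in plain Python. The forward pass iterates x ← g(x, z) to a fixed point x*. The backward pass solves c = J_xᵀc + ∂L/∂x* by a second fixed-point iteration that uses only vector-Jacobian products, so the Jacobian is never formed or inverted. Here the products are written by hand, whereas the abstract describes obtaining them through a small network module.

```python
import math

# Hypothetical 2x2 weight matrix; its rows sum to at most 0.5 in absolute
# value, so g below is a contraction and both iterations converge.
W = [[0.3, -0.2], [0.1, 0.4]]
N = 2

def g(x, z):
    """Assumed update map g(x, z) = tanh(W x + z)."""
    return [math.tanh(sum(W[i][j] * x[j] for j in range(N)) + z[i])
            for i in range(N)]

def forward_fpi(z, tol=1e-12, max_iter=1000):
    """Forward FPI: iterate x <- g(x, z) until the update is below tol."""
    x = [0.0] * N
    for _ in range(max_iter):
        x_new = g(x, z)
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x

def vjp_z(x_star, z, c):
    """(dg/dz)^T c. For this g, dg/dz = diag(1 - tanh^2)."""
    y = g(x_star, z)
    return [c[i] * (1.0 - y[i] ** 2) for i in range(N)]

def vjp_x(x_star, z, c):
    """(dg/dx)^T c = W^T diag(1 - tanh^2) c, without building dg/dx."""
    s = vjp_z(x_star, z, c)
    return [sum(W[i][j] * s[i] for i in range(N)) for j in range(N)]

def backward_fpi(x_star, z, grad_x, tol=1e-12, max_iter=1000):
    """Backward FPI: solve c = J_x^T c + grad_x by iteration, then
    return dL/dz = J_z^T c."""
    c = list(grad_x)
    for _ in range(max_iter):
        jc = vjp_x(x_star, z, c)
        c_new = [jc[i] + grad_x[i] for i in range(N)]
        converged = max(abs(a - b) for a, b in zip(c_new, c)) < tol
        c = c_new
        if converged:
            break
    return vjp_z(x_star, z, c)

def loss(z):
    """Assumed loss L = 0.5 * ||x*(z)||^2, so dL/dx* = x*."""
    return 0.5 * sum(v * v for v in forward_fpi(z))

if __name__ == "__main__":
    z = [0.5, -0.3]
    x_star = forward_fpi(z)
    print("fixed point:", x_star)
    print("dL/dz via backward FPI:", backward_fpi(x_star, z, x_star))
```

The backward iteration converges for the same reason the forward one does: the transposed Jacobian of a contraction has spectral radius below one. A finite-difference comparison on `loss` is a simple way to check the result.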