Paper Title

An Optimal Time Variable Learning Framework for Deep Neural Networks

Paper Authors

Antil, Harbir, Díaz, Hugo, Herberg, Evelyn

Paper Abstract


Feature propagation in Deep Neural Networks (DNNs) can be associated with nonlinear discrete dynamical systems. The novelty of this paper lies in letting the discretization parameter (time step-size) vary from layer to layer; these step sizes are learned within an optimization framework. The proposed framework can be applied to any existing network, such as ResNet, DenseNet or Fractional-DNN, and is shown to help overcome the vanishing and exploding gradient issues. Stability of some existing continuous DNNs, such as Fractional-DNN, is also studied. The proposed approach is applied to an ill-posed 3D-Maxwell's equation.
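To make the dynamical-systems view concrete: a ResNet forward pass can be read as an explicit Euler discretization x_{l+1} = x_l + h_l f(x_l), where the paper's idea is to treat each layer's step size h_l as a trainable parameter rather than a fixed constant. The following is a minimal NumPy sketch of this forward propagation only (the `forward` function, weight shapes, and tanh activation are illustrative assumptions, not the paper's implementation); the optimization of the h_l themselves is omitted.

```python
import numpy as np

def forward(x, weights, biases, step_sizes):
    """ResNet-style propagation x <- x + h_l * tanh(W_l x + b_l).

    In the paper's framework the per-layer step sizes h_l would be
    learned jointly with W_l, b_l; here they are just given values.
    """
    for W, b, h in zip(weights, biases, step_sizes):
        x = x + h * np.tanh(W @ x + b)
    return x

rng = np.random.default_rng(0)
dim, depth = 4, 3
weights = [0.1 * rng.standard_normal((dim, dim)) for _ in range(depth)]
biases = [np.zeros(dim) for _ in range(depth)]
x0 = rng.standard_normal(dim)

# Classical ResNet corresponds to a uniform step size; the proposed
# framework lets the step size vary from layer to layer.
out_uniform = forward(x0, weights, biases, [0.1] * depth)
out_varying = forward(x0, weights, biases, [0.05, 0.1, 0.2])
```

Setting every h_l to zero recovers the identity map, which is one intuition for why small learned step sizes can temper exploding gradients in deep stacks.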
