Paper Title

Langevin algorithms for very deep Neural Networks with application to image classification

Authors

Bras, Pierre

Abstract

Training a very deep neural network is a challenging task, as the deeper a neural network is, the more non-linear it is. We compare the performance of various preconditioned Langevin algorithms with their non-Langevin counterparts for the training of neural networks of increasing depth. For shallow neural networks, Langevin algorithms do not lead to any improvement; however, the deeper the network, the greater the gains provided by Langevin algorithms. Adding noise to the gradient descent allows the optimizer to escape local traps, which are more frequent for very deep neural networks. Following this heuristic, we introduce a new Langevin algorithm called Layer Langevin, which consists in adding Langevin noise only to the weights associated with the deepest layers. We then demonstrate the benefits of Langevin and Layer Langevin algorithms for the training of popular deep residual architectures for image classification.
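To make the idea in the abstract concrete, here is a minimal sketch of a Layer Langevin-style update: a plain gradient step theta <- theta - lr * grad(loss), with Gaussian noise of scale sigma * sqrt(lr) added only to the parameters of selected (deepest) layers. The PyTorch code, the function name `layer_langevin_step`, the layer-name prefixes, and the flat noise scale `sigma` are illustrative assumptions; the paper studies preconditioned Langevin variants, which are not reproduced here.

```python
# Illustrative sketch (not the paper's exact algorithm): a plain SGD step with
# Langevin noise added only to the parameters of selected (deepest) layers.
import math
import torch

def layer_langevin_step(model, loss, lr=1e-3, sigma=1e-3, noisy_prefixes=()):
    """One update: theta <- theta - lr * grad, plus sigma * sqrt(lr) * N(0, I)
    noise on every parameter whose name starts with one of `noisy_prefixes`."""
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            p.add_(p.grad, alpha=-lr)  # gradient descent step
            if any(name.startswith(pref) for pref in noisy_prefixes):
                # Euler discretization of the Langevin SDE: noise scaled by sqrt(step size)
                p.add_(sigma * math.sqrt(lr) * torch.randn_like(p))

# Example usage (hypothetical layer names for a ResNet-style model):
# layer_langevin_step(net, criterion(net(x), y), noisy_prefixes=("layer4.", "fc."))
```

Setting `noisy_prefixes=()` recovers plain gradient descent, while listing every layer gives a standard (unpreconditioned) Langevin update, so the same sketch covers both baselines compared in the paper.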
