Paper Title
Why Unsupervised Deep Networks Generalize
Paper Authors
Paper Abstract
Promising resolutions of the generalization puzzle observe that the actual number of parameters in a deep network is much smaller than naive estimates suggest. The renormalization group is a compelling example of a problem which has very few parameters, despite the fact that naive estimates suggest otherwise. Our central hypothesis is that the mechanisms behind the renormalization group are also at work in deep learning, and that this leads to a resolution of the generalization puzzle. We present detailed quantitative evidence supporting this hypothesis for an RBM, by showing that the trained RBM discards high-momentum modes. Focusing mainly on autoencoders, we give an algorithm that determines the network's parameters directly from the learning data set. The resulting autoencoder performs almost as well as one trained by deep learning, and it provides an excellent initial condition for training, reducing training times by a factor between 4 and 100 for the experiments we considered. Further, we are able to suggest a simple criterion to decide if a given problem can or cannot be solved using a deep network.
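The abstract does not spell out the algorithm that determines the autoencoder's parameters from the data, so the following is only a minimal sketch of the general idea of data-derived initialization, using a standard PCA-style construction as a hypothetical stand-in: the encoder weights are taken from the leading principal directions of the training data, and the resulting linear autoencoder can be evaluated directly or used as an initial condition for further training. The function names and the synthetic data set are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: a PCA-style, data-derived initialization of a linear
# autoencoder, used here as a stand-in for "determining the network's parameters
# directly from the learning data set". Not the authors' actual algorithm.
import numpy as np

def pca_autoencoder_weights(X, n_hidden):
    """Build encoder/decoder weights from the top principal directions of X,
    where X has shape (n_samples, n_features)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_enc = Vt[:n_hidden]      # (n_hidden, n_features)
    W_dec = W_enc.T            # tied weights for the decoder
    return W_enc, W_dec, mean

def reconstruct(X, W_enc, W_dec, mean):
    """One linear autoencoder pass: encode, then decode."""
    codes = (X - mean) @ W_enc.T
    return codes @ W_dec.T + mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data concentrated in a low-dimensional subspace plus small noise.
    basis = rng.normal(size=(8, 64))
    X = rng.normal(size=(1000, 8)) @ basis + 0.05 * rng.normal(size=(1000, 64))

    W_enc, W_dec, mean = pca_autoencoder_weights(X, n_hidden=8)
    err = np.mean((X - reconstruct(X, W_enc, W_dec, mean)) ** 2)
    print(f"reconstruction MSE with data-derived weights: {err:.4f}")
```

In this spirit, the data-derived weights could also be copied into a nonlinear autoencoder as its starting point before gradient training, which is the kind of use the abstract's reported 4 to 100 times reduction in training time refers to.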