Paper Title
The Implicit and Explicit Regularization Effects of Dropout
Paper Authors
Paper Abstract
Dropout is a widely used regularization technique, often required to obtain state-of-the-art results for a number of architectures. This work demonstrates that dropout introduces two distinct but entangled regularization effects: an explicit effect (also studied in prior work), which occurs because dropout modifies the expected training objective, and, perhaps surprisingly, an additional implicit effect arising from the stochasticity of the dropout training update. This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent. We disentangle these two effects through controlled experiments. We then derive analytic simplifications which characterize each effect in terms of the derivatives of the model and the loss, for deep neural networks. We demonstrate that these simplified, analytic regularizers accurately capture the important aspects of dropout, showing that they can faithfully replace dropout in practice.
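To make the distinction concrete, the following is a minimal sketch (not the paper's code) of standard inverted dropout in NumPy. At training time each forward pass draws a random mask, so a gradient step depends on both the mask-averaged objective (the explicit effect the abstract describes) and the per-step noise around that average (the implicit effect); at test time the layer is deterministic and equals the expectation over masks.

```python
import numpy as np

def dropout_forward(x, p=0.5, rng=None, train=True):
    """Inverted dropout: zero each unit with probability p, rescale by 1/(1-p).

    The rescaling makes the expected output over random masks equal to x,
    so the test-time (train=False) identity map matches that expectation.
    """
    if not train:
        return x
    rng = np.random.default_rng() if rng is None else rng
    mask = (rng.random(x.shape) >= p).astype(x.dtype)
    return x * mask / (1.0 - p)

# Averaging many stochastic training-time passes approaches the
# deterministic test-time output, illustrating that the noise is
# zero-mean around the expected (explicit) objective.
x = np.ones(4)
rng = np.random.default_rng(0)
avg = np.mean([dropout_forward(x, p=0.5, rng=rng) for _ in range(10000)],
              axis=0)
```

The residual `avg - x` is exactly the kind of per-update stochasticity whose cumulative effect the paper isolates as implicit regularization.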