Paper Title
Uncertainty Quantification in Deep Residual Neural Networks
Paper Authors
Paper Abstract
Uncertainty quantification is an important and challenging problem in deep learning. Previous methods rely either on dropout layers, which are absent from many modern deep architectures, or on batch normalization, which is sensitive to batch size. In this work, we address the problem of uncertainty quantification in deep residual networks by using a regularization technique called stochastic depth. We show that training residual networks with stochastic depth can be interpreted as a variational approximation to the intractable posterior over the weights of a Bayesian neural network. We demonstrate that meaningful uncertainty estimates can be obtained by sampling from a distribution of residual networks with varying depth and shared weights. Moreover, compared with the original formulation of residual networks, our method produces well-calibrated softmax probabilities with only minor changes to the network's structure. We evaluate our approach on popular computer vision datasets and measure the quality of the uncertainty estimates. We also test robustness to domain shift and show that our method expresses higher predictive uncertainty on out-of-distribution samples. Finally, we demonstrate how the proposed approach can be used to obtain uncertainty estimates in facial verification applications.
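To make the test-time procedure sketched in the abstract concrete, the following is a minimal PyTorch-style sketch, not the authors' code: each residual branch is gated by a Bernoulli variable that stays stochastic at inference, and predictions are averaged over several stochastic forward passes, with predictive entropy used as an uncertainty measure. The block and function names, the survival probability, and the number of Monte Carlo samples are illustrative assumptions.

```python
# Minimal sketch of Monte Carlo inference with stochastic depth kept active at
# test time. Assumptions: a PyTorch model built from blocks like the one below;
# p_survive and n_samples are illustrative, not values from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticDepthBlock(nn.Module):
    """Residual block whose residual branch is dropped with probability 1 - p_survive."""
    def __init__(self, channels: int, p_survive: float = 0.8):
        super().__init__()
        self.p_survive = p_survive
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The gate is sampled regardless of train/eval mode, so repeated forward
        # passes draw different sub-networks (varying depth, shared weights).
        gate = torch.bernoulli(torch.tensor(self.p_survive, device=x.device))
        return F.relu(x + gate * self.branch(x))

@torch.no_grad()
def mc_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 20):
    """Average softmax outputs over stochastic forward passes and return the
    mean predictive distribution together with its entropy."""
    model.eval()  # BatchNorm uses running stats; the depth gates stay stochastic
    probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_samples)])
    mean_probs = probs.mean(dim=0)
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy
```

In this sketch the averaged softmax output serves as the calibrated prediction, and higher entropy flags inputs (for example, out-of-distribution samples) on which the ensemble of sampled sub-networks disagrees.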