Paper Title
Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures
Paper Authors
Paper Abstract
Uncertainty quantification for complex deep learning models is increasingly important as these techniques see growing use in high-stakes, real-world settings. Currently, the quality of a model's uncertainty is evaluated using point-prediction metrics such as the negative log-likelihood or the Brier score on held-out data. In this study, we provide the first large-scale evaluation of the empirical frequentist coverage properties of well-known uncertainty quantification techniques on a suite of regression and classification tasks. We find that, in general, some methods do achieve desirable coverage properties on in-distribution samples, but that coverage is not maintained on out-of-distribution data. Our results demonstrate the failings of current uncertainty quantification techniques as dataset shift increases and establish coverage as an important metric in developing models for real-world applications.
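To make the notion of empirical frequentist coverage concrete, the sketch below (not taken from the paper; the function name, the Gaussian predictive assumption, and the 95% level are illustrative) estimates the fraction of held-out regression targets that fall inside a model's central prediction intervals:

```python
# Minimal sketch (illustrative, not the paper's code): empirical coverage of
# central prediction intervals for a model that outputs a Gaussian predictive
# distribution per test point.
import numpy as np
from scipy import stats

def empirical_coverage(y_true, pred_mean, pred_std, level=0.95):
    """Return the fraction of targets inside the central `level` interval
    and the average interval width, assuming Gaussian predictive marginals."""
    z = stats.norm.ppf(0.5 + level / 2.0)           # ~1.96 for a 95% interval
    lower = pred_mean - z * pred_std
    upper = pred_mean + z * pred_std
    covered = (y_true >= lower) & (y_true <= upper)
    return covered.mean(), (upper - lower).mean()

# Toy usage: a well-calibrated model should yield coverage close to `level`.
rng = np.random.default_rng(0)
y = rng.normal(0.0, 1.0, size=10_000)   # held-out targets
mu = np.zeros_like(y)                   # predictive means
sigma = np.ones_like(y)                 # predictive standard deviations
cov, width = empirical_coverage(y, mu, sigma, level=0.95)
print(f"coverage={cov:.3f}, avg width={width:.3f}")  # coverage should be ~0.95
```

In this framing, a method's intervals are evaluated both by how close the empirical coverage is to the nominal level and by how wide the intervals are, which is the trade-off the paper examines under increasing dataset shift.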