Paper Title
The Hidden Uncertainty in a Neural Network's Activations
Paper Authors
Paper Abstract
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data. This work investigates whether this distribution moreover correlates with a model's epistemic uncertainty, and thus indicates its ability to generalise to novel inputs. We first empirically verify that epistemic uncertainty can be identified with the surprise, i.e. the negative log-likelihood, of observing a particular latent representation. Moreover, we demonstrate that the output-conditional distribution of hidden representations also allows quantifying aleatoric uncertainty via the entropy of the predictive distribution. We analyse epistemic and aleatoric uncertainty inferred from the representations of different layers and conclude that deeper layers lead to uncertainty that behaves similarly to established, but computationally more expensive, methods (e.g. deep ensembles). While our approach does not require modifying the training process, we follow prior work and experiment with an additional regularising loss that increases the information in the latent representations. We find that this leads to improved OOD detection using epistemic uncertainty, at the cost of ambiguous calibration close to the data distribution. We verify our findings on both classification and regression models.
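As a rough sketch of the core idea described in the abstract (not the authors' implementation), one can fit a density model to the latent representations of the training data and score a test input by the surprise of its representation, while reading aleatoric uncertainty off the entropy of the predictive distribution. The sketch below assumes a single multivariate Gaussian for brevity, whereas the paper conditions the latent density on the model's output; all function names are illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_latent_density(latents):
    """Fit a multivariate Gaussian to training latents of shape (N, D).

    A single Gaussian is an assumption made here for brevity; the paper
    works with the output-conditional distribution of the representations.
    """
    mu = latents.mean(axis=0)
    # A small ridge keeps the covariance estimate positive definite.
    cov = np.cov(latents, rowvar=False) + 1e-6 * np.eye(latents.shape[1])
    return multivariate_normal(mean=mu, cov=cov)

def epistemic_uncertainty(density, z):
    """Surprise (negative log-likelihood) of latent representation(s) z."""
    return -density.logpdf(z)

def aleatoric_uncertainty(probs, eps=1e-12):
    """Entropy of the predictive class distribution, shape (..., C)."""
    p = np.clip(probs, eps, 1.0)
    return -(p * np.log(p)).sum(axis=-1)
```

A plausible OOD check then thresholds the epistemic score, e.g. `epistemic_uncertainty(density, z_test) > tau` for a threshold `tau` calibrated on held-out in-distribution data; per the abstract, representations taken from deeper layers yield uncertainty that behaves most like deep ensembles.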