Paper Title

A New Family of Generalization Bounds Using Samplewise Evaluated CMI

Paper Authors

Fredrik Hellström, Giuseppe Durisi

Paper Abstract

We present a new family of information-theoretic generalization bounds, in which the training loss and the population loss are compared through a jointly convex function. This function is upper-bounded in terms of the disintegrated, samplewise, evaluated conditional mutual information (CMI), an information measure that depends on the losses incurred by the selected hypothesis, rather than on the hypothesis itself, as is common in probably approximately correct (PAC)-Bayesian results. We demonstrate the generality of this framework by recovering and extending previously known information-theoretic bounds. Furthermore, using the evaluated CMI, we derive a samplewise, average version of Seeger's PAC-Bayesian bound, where the convex function is the binary KL divergence. In some scenarios, this novel bound results in a tighter characterization of the population loss of deep neural networks than previous bounds. Finally, we derive high-probability versions of some of these average bounds. We demonstrate the unifying nature of the evaluated CMI bounds by using them to recover average and high-probability generalization bounds for multiclass classification with finite Natarajan dimension.
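
For orientation, the following is a rough, illustrative sketch rather than the paper's exact statement: the binary KL divergence mentioned in the abstract is defined as below, and a samplewise, average Seeger-style bound typically takes the shape shown, where $\hat{L}_n$, $L$, and the per-sample information terms $I_i$ are placeholder symbols introduced here for illustration.

% Binary KL divergence between Bernoulli parameters p, q in (0, 1):
\[
  d(p \,\|\, q) \;=\; p \ln\frac{p}{q} \;+\; (1 - p)\ln\frac{1 - p}{1 - q}.
\]
% Illustrative shape of a samplewise, average Seeger-type bound
% (hat L_n: expected training loss; L: expected population loss;
%  I_i: a per-sample evaluated-CMI term; all symbols are placeholders,
%  not the paper's exact theorem):
\[
  d\!\left(\mathbb{E}\big[\hat{L}_n\big] \,\middle\|\, \mathbb{E}\big[L\big]\right)
  \;\le\; \frac{1}{n}\sum_{i=1}^{n} I_i.
\]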
