Paper title
Chained Generalisation Bounds
Paper authors
Abstract
This work discusses how to derive upper bounds for the expected generalisation error of supervised learning algorithms by means of the chaining technique. By developing a general theoretical framework, we establish a duality between generalisation bounds based on the regularity of the loss function, and their chained counterparts, which can be obtained by lifting the regularity assumption from the loss onto its gradient. This allows us to re-derive the chaining mutual information bound from the literature, and to obtain novel chained information-theoretic generalisation bounds, based on the Wasserstein distance and other probability metrics. We show on some toy examples that the chained generalisation bound can be significantly tighter than its standard counterpart, particularly when the distribution of the hypotheses selected by the algorithm is very concentrated. Keywords: Generalisation bounds; Chaining; Information-theoretic bounds; Mutual information; Wasserstein distance; PAC-Bayes.
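As a point of reference, the standard (unchained) mutual information bound that the abstract mentions re-deriving is the one due to Xu and Raginsky (2017). A sketch of its statement, assuming the loss $\ell(w, Z)$ is $\sigma$-subgaussian under the data distribution for every fixed hypothesis $w$:

```latex
% Standard mutual information generalisation bound (Xu & Raginsky, 2017):
% for a training sample S of n i.i.d. points and algorithm output W,
% if \ell(w, Z) is \sigma-subgaussian for every fixed w, then
\[
  \bigl| \mathbb{E}\bigl[\mathrm{gen}(W, S)\bigr] \bigr|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)},
\]
% where I(W; S) is the mutual information between the learned
% hypothesis and the training sample.
```

The chained variants discussed in the paper replace the single term $I(W;S)$ by a sum of mutual information terms along a hierarchy of increasingly fine quantisations of the hypothesis space, which can be much smaller when the distribution of the selected hypothesis is highly concentrated.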