Paper Title

Robustness Implies Generalization via Data-Dependent Generalization Bounds

Authors

Kenji Kawaguchi, Zhun Deng, Kyle Luh, Jiaoyang Huang

Abstract

This paper proves that robustness implies generalization via data-dependent generalization bounds. As a result, robustness and generalization are shown to be connected closely in a data-dependent manner. Our bounds improve previous bounds in two directions, to solve an open problem that has seen little development since 2010. The first is to reduce the dependence on the covering number. The second is to remove the dependence on the hypothesis space. We present several examples, including ones for lasso and deep learning, in which our bounds are provably preferable. The experiments on real-world data and theoretical models demonstrate near-exponential improvements in various situations. To achieve these improvements, we do not require additional assumptions on the unknown distribution; instead, we only incorporate an observable and computable property of the training samples. A key technical innovation is an improved concentration bound for multinomial random variables that is of independent interest beyond robustness and generalization.
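For background, the classical robustness-based generalization guarantee that these results improve upon can be sketched as follows. This is the standard (K, ε)-robustness bound in the style of Xu and Mannor (2012), not the paper's own data-dependent bound; the symbols K, ε(S), M, n, and δ follow that standard setup and are used here only for illustration.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Classical (K, epsilon)-robustness bound in the style of Xu and Mannor (2012),
% shown only as background for the bounds this paper improves.
% K      : number of partitions of the sample space (tied to a covering number)
% eps(S) : robustness level of the learned model f_S on the training set S
% M      : upper bound on the loss, n : sample size, delta : confidence parameter
% With probability at least 1 - delta over the draw of the n i.i.d. samples S:
\[
\Bigl| \mathbb{E}_{z}\bigl[\ell(f_S, z)\bigr]
      - \tfrac{1}{n}\textstyle\sum_{i=1}^{n} \ell(f_S, z_i) \Bigr|
\;\le\; \epsilon(S) + M \sqrt{\frac{2K \ln 2 + 2 \ln (1/\delta)}{n}} .
\]
% The data-dependent bounds in this paper target the sqrt(K/n) term: instead of the
% full partition count K, they use observable quantities describing how the training
% samples actually occupy the partitions, with no extra assumptions on the distribution.
\end{document}
```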
