Paper Title

Learning, compression, and leakage: Minimising classification error via meta-universal compression principles

Paper Authors

Rosas, Fernando E., Mediano, Pedro A. M., Gastpar, Michael

Paper Abstract

Learning and compression are driven by the common aim of identifying and exploiting statistical regularities in data, which opens the door for fertile collaboration between these areas. A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding, which provides strong guarantees for the compression of small datasets, in contrast with more popular estimators whose guarantees hold only in the asymptotic limit. Here we consider an NML-based decision strategy for supervised classification problems, and show that it attains heuristic PAC learning when applied to a wide variety of models. Furthermore, we show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric that quantifies the potential for data leakage in privacy-sensitive scenarios.
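
The abstract does not spell out the decision strategy, but the NML idea it builds on is easy to illustrate. Below is a minimal, hypothetical Python sketch, assuming a Bernoulli model class: the NML distribution assigns a sequence its maximised likelihood, normalised over all sequences of the same length, and a toy classifier assigns a new point to the class whose NML code length grows least when the point is appended. All function names are illustrative, and this is a rough sketch of the NML principle, not the authors' exact method.

```python
import math

def bernoulli_max_lik(k, n):
    """Maximised Bernoulli likelihood max_theta theta^k (1 - theta)^(n - k),
    attained at the maximum-likelihood estimate theta_hat = k / n."""
    if n == 0:
        return 1.0
    th = k / n
    return th ** k * (1 - th) ** (n - k)

def nml_log_prob(bits):
    """NML log-probability of a binary sequence under the Bernoulli class:
    the maximised likelihood, normalised by its sum over all 2^n sequences
    of the same length (grouped by their count of ones)."""
    n, k = len(bits), sum(bits)
    denom = sum(math.comb(n, j) * bernoulli_max_lik(j, n) for j in range(n + 1))
    return math.log(bernoulli_max_lik(k, n) / denom)

def nml_classify(train, x_new):
    """Toy NML-flavoured classifier (illustration only): pick the class
    whose NML code length increases least when x_new is appended."""
    def extra_code_length(bits):
        return nml_log_prob(bits) - nml_log_prob(bits + [x_new])
    return min(train, key=lambda c: extra_code_length(train[c]))

# Class 0 emits mostly zeros, class 1 mostly ones; a fresh '1' joins class 1.
train = {0: [0, 0, 0, 1, 0], 1: [1, 1, 0, 1, 1]}
print(nml_classify(train, 1))  # -> 1
```

For reference, the maximal leakage mentioned in the abstract is usually defined as L(X -> Y) = log Σ_y max_{x: P(x) > 0} P(y|x), i.e., the log of the summed column maxima of the channel from the private data X to the observable Y.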
