Paper Title

Evaluated CMI Bounds for Meta Learning: Tightness and Expressiveness

Paper Authors

Fredrik Hellström, Giuseppe Durisi

Paper Abstract

Recent work has established that the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020) is expressive enough to capture generalization guarantees in terms of algorithmic stability, VC dimension, and related complexity measures for conventional learning (Harutyunyan et al., 2021; Haghifam et al., 2021). Hence, it provides a unified method for establishing generalization bounds. In meta learning, there has so far been a divide between information-theoretic results and results from classical learning theory. In this work, we take a first step toward bridging this divide. Specifically, we present novel generalization bounds for meta learning in terms of the evaluated CMI (e-CMI). To demonstrate the expressiveness of the e-CMI framework, we apply our bounds to a representation learning setting, with $n$ samples from $\hat n$ tasks parameterized by functions of the form $f_i \circ h$. Here, each $f_i \in \mathcal F$ is a task-specific function, and $h \in \mathcal H$ is the shared representation. For this setup, we show that the e-CMI framework yields a bound that scales as $\sqrt{\mathcal C(\mathcal H)/(n\hat n) + \mathcal C(\mathcal F)/n}$, where $\mathcal C(\cdot)$ denotes a complexity measure of the hypothesis class. This scaling behavior coincides with the one reported in Tripuraneni et al. (2020) using Gaussian complexity.
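
To make the scaling behavior concrete, the following is a minimal LaTeX sketch of the representation learning setup described in the abstract. The generalization-gap symbol $\Delta$ and the informal $\lesssim$ ("scales as") notation are our own illustrative placeholders, not notation taken from the paper.

```latex
% Minimal sketch of the representation learning setup from the abstract.
% Placeholder notation: \Delta for the generalization gap and \lesssim for
% "scales as" are ours, not the paper's.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Each of the $\hat n$ tasks is fit by a composite hypothesis
\[
  g_i = f_i \circ h, \qquad f_i \in \mathcal F,\; h \in \mathcal H,
\]
where the representation $h$ is shared across all tasks and each $f_i$ is
task-specific. With $n$ samples per task, the e-CMI bound on the
generalization gap scales as
\[
  \Delta \lesssim
  \sqrt{\frac{\mathcal C(\mathcal H)}{n \hat n}
        + \frac{\mathcal C(\mathcal F)}{n}} .
\]
Intuitively, the cost of learning the shared representation class
$\mathcal H$ is amortized over all $n \hat n$ samples pooled across tasks,
while each task-specific $f_i$ must be learned from the $n$ samples of its
own task.
\end{document}
```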
