Paper Title

Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning

Authors

Mayee F. Chen, Daniel Y. Fu, Avanika Narayan, Michael Zhang, Zhao Song, Kayvon Fatahalian, Christopher Ré

Abstract

An ideal learned representation should display transferability and robustness. Supervised contrastive learning (SupCon) is a promising method for training accurate models, but produces representations that do not capture these properties due to class collapse -- when all points in a class map to the same representation. Recent work suggests that "spreading out" these representations improves them, but the precise mechanism is poorly understood. We argue that creating spread alone is insufficient for better representations, since spread is invariant to permutations within classes. Instead, both the correct degree of spread and a mechanism for breaking this invariance are necessary. We first prove that adding a weighted class-conditional InfoNCE loss to SupCon controls the degree of spread. Next, we study three mechanisms to break permutation invariance: using a constrained encoder, adding a class-conditional autoencoder, and using data augmentation. We show that the latter two encourage clustering of latent subclasses under more realistic conditions than the former. Using these insights, we show that adding a properly-weighted class-conditional InfoNCE loss and a class-conditional autoencoder to SupCon achieves 11.1 points of lift on coarse-to-fine transfer across 5 standard datasets and 4.7 points on worst-group robustness on 3 datasets, setting state-of-the-art on CelebA by 11.5 points.
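
For concreteness, below is a minimal PyTorch-style sketch of the kind of objective the abstract describes: a SupCon term, a weighted class-conditional InfoNCE term that controls the degree of spread within each class, and a class-conditional autoencoder reconstruction term. The function names, the weights `alpha` and `beta`, and all implementation details here are illustrative assumptions, not the authors' released code.

```python
# Illustrative sketch (not the authors' code) of a SupCon objective augmented
# with a weighted class-conditional InfoNCE term and a class-conditional
# autoencoder reconstruction term, as described in the abstract.
import torch
import torch.nn.functional as F


def supcon_loss(z, labels, temperature=0.1):
    """Supervised contrastive loss: pulls embeddings with the same label together."""
    z = F.normalize(z, dim=1)
    sim = z @ z.T / temperature
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    sim = sim.masked_fill(eye, float("-inf"))  # exclude self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Average log-probability over each point's positives (same-class points).
    pos_log_prob = torch.where(pos_mask, log_prob, torch.zeros_like(log_prob))
    return -(pos_log_prob.sum(1) / pos_mask.sum(1).clamp(min=1)).mean()


def class_conditional_infonce(z, z_aug, labels, temperature=0.1):
    """InfoNCE restricted to each class: a point's positive is its own
    augmented view, and its negatives are other points of the *same* class.
    Weighting this term controls how much the class spreads out."""
    z = F.normalize(z, dim=1)
    z_aug = F.normalize(z_aug, dim=1)
    loss, n_classes = z.new_zeros(()), 0
    for c in labels.unique():
        idx = (labels == c).nonzero(as_tuple=True)[0]
        if idx.numel() < 2:
            continue  # need at least one in-class negative
        sim = z[idx] @ z_aug[idx].T / temperature
        targets = torch.arange(idx.numel(), device=z.device)
        loss = loss + F.cross_entropy(sim, targets)
        n_classes += 1
    return loss / max(n_classes, 1)


# Putting it together (encoder, decoder, alpha, and beta are placeholders;
# the decoder also sees the class label, which is what makes the autoencoder
# class-conditional and breaks permutation invariance within classes):
#   z, z_aug = encoder(x), encoder(augment(x))
#   recon = decoder(torch.cat([z, F.one_hot(y, num_classes).float()], dim=1))
#   loss = supcon_loss(z, y) \
#          + alpha * class_conditional_infonce(z, z_aug, y) \
#          + beta * F.mse_loss(recon, x)
```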
