Paper Title
Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
Paper Authors
Paper Abstract
Contrastive learning, especially self-supervised contrastive learning (SSCL), has achieved great success in extracting powerful features from unlabeled data. In this work, we contribute to the theoretical understanding of SSCL and uncover its connection to the classic data visualization method, stochastic neighbor embedding (SNE), whose goal is to preserve pairwise distances. From the perspective of preserving neighborhood information, SSCL can be viewed as a special case of SNE with the input-space pairwise similarities specified by data augmentation. The established correspondence facilitates a deeper theoretical understanding of the features learned by SSCL, as well as methodological guidelines for practical improvement. Specifically, through the lens of SNE, we provide new analyses of domain-agnostic augmentations, implicit bias, and the robustness of learned features. To illustrate the practical advantage, we demonstrate that the modifications from SNE to $t$-SNE can also be adopted in the SSCL setting, achieving significant improvements in both in-distribution and out-of-distribution generalization.
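To make the stated correspondence concrete, the following is a minimal sketch under the standard SNE and InfoNCE formulations; the notation ($p_{j|i}$, $q_{j|i}$, embeddings $z_i$, temperature $\tau$) is introduced here for illustration and is an assumption, not taken verbatim from the paper. SNE matches input-space and embedding-space neighbor distributions by minimizing

$$
\mathcal{L}_{\mathrm{SNE}} \;=\; \sum_{i} \mathrm{KL}\big(P_i \,\|\, Q_i\big) \;=\; \sum_{i}\sum_{j \neq i} p_{j|i} \log \frac{p_{j|i}}{q_{j|i}},
\qquad
q_{j|i} \;=\; \frac{\exp\!\big(-\|z_i - z_j\|^2\big)}{\sum_{k \neq i} \exp\!\big(-\|z_i - z_k\|^2\big)}.
$$

In SSCL, the input-space similarity $p_{j|i}$ is induced by data augmentation: $p_{j|i}$ is large precisely when $x_j$ is an augmented view of the same underlying datum as $x_i$, and is (approximately) zero otherwise. The standard InfoNCE loss,

$$
\mathcal{L}_{\mathrm{InfoNCE}} \;=\; -\sum_{i} \log \frac{\exp\!\big(z_i^{\top} z_{i^{+}} / \tau\big)}{\sum_{k \neq i} \exp\!\big(z_i^{\top} z_k / \tau\big)},
$$

then takes the same form as the SNE objective with this augmentation-induced $P$ and the kernel $\exp(z_i^{\top} z_j / \tau)$ in place of the Gaussian kernel; swapping in the heavier-tailed Student-$t$ kernel of $t$-SNE yields the kind of modification the abstract refers to.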