Joint Embedding Self-Supervised Learning in the Kernel Regime

Paper title

Joint Embedding Self-Supervised Learning in the Kernel Regime

Authors

Kiani, Bobak T., Balestriero, Randall, Chen, Yubei, Lloyd, Seth, LeCun, Yann

Abstract

The fundamental goal of self-supervised learning (SSL) is to produce useful representations of data without access to any labels for classifying the data. Modern methods in SSL, which form representations based on known or constructed relationships between samples, have been particularly effective at this task. Here, we aim to extend this framework to incorporate algorithms based on kernel methods where embeddings are constructed by linear maps acting on the feature space of a kernel. In this kernel regime, we derive methods to find the optimal form of the output representations for contrastive and non-contrastive loss functions. This procedure produces a new representation space with an inner product denoted as the induced kernel which generally correlates points which are related by an augmentation in kernel space and de-correlates points otherwise. We analyze our kernel model on small datasets to identify common features of self-supervised learning algorithms and gain theoretical insights into their performance on downstream tasks.
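
The setup described in the abstract can be illustrated with a minimal sketch: an embedding built as a linear map on kernel features, and the induced kernel as the inner product of two such embeddings. Everything named here is an assumption for illustration only: the Gaussian (RBF) kernel, the helper names `rbf_kernel`, `embed`, and `induced_kernel`, and above all the coefficient matrix `coeffs`, which is random and is not the optimal solution the paper derives for contrastive or non-contrastive losses.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
    The choice of kernel is an assumption; any positive semidefinite kernel works."""
    sq_dists = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def embed(X_train, coeffs, X_new, gamma=1.0):
    """Embedding f(x) = coeffs^T k(X_train, x): a linear map acting on kernel features."""
    return rbf_kernel(X_new, X_train, gamma) @ coeffs

def induced_kernel(X_train, coeffs, A, B, gamma=1.0):
    """Induced kernel: inner product of the embeddings, k_ind(a, b) = <f(a), f(b)>."""
    return embed(X_train, coeffs, A, gamma) @ embed(X_train, coeffs, B, gamma).T

rng = np.random.default_rng(0)
X_train = rng.normal(size=(6, 2))   # six toy samples in two dimensions
coeffs = rng.normal(size=(6, 3))    # hypothetical placeholder, NOT the paper's optimum
K_ind = induced_kernel(X_train, coeffs, X_train, X_train)
```

Because `K_ind` is a Gram matrix of three-dimensional embeddings, it is symmetric, positive semidefinite, and has rank at most three, whatever coefficients are used. In the paper's setting the coefficients would instead be chosen to optimize an SSL loss, so that augmentation-related points become correlated under the induced kernel.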
