Paper Title
A Multi-view Perspective of Self-supervised Learning
Paper Authors
Abstract
As a newly emerging unsupervised learning paradigm, self-supervised learning (SSL) has recently gained widespread attention. SSL typically introduces a pretext task that requires no manual annotation of data; with its help, SSL effectively learns feature representations beneficial for downstream tasks. The pretext task therefore plays a key role, yet the study of its design, and especially of its essence, remains open. In this paper, we adopt a multi-view perspective to decouple a popular class of pretext tasks into a combination of view data augmentation (VDA) and view label classification (VLC), through which we explore the essence of such pretext tasks and offer insights into their design. Specifically, we design a simple multi-view learning framework (SSL-MV) that assists the feature learning of the downstream task (the original view) through the same task performed on the augmented views. SSL-MV focuses on VDA while abandoning VLC, empirically uncovering that it is VDA, rather than the generally assumed VLC, that dominates the performance of such SSL. Additionally, because VLC is replaced with VDA tasks, SSL-MV enables an integrated inference that combines predictions from the augmented views, further improving performance. Experiments on several benchmark datasets demonstrate these advantages.
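As a rough illustration of the integrated-inference idea described in the abstract (a sketch, not the authors' implementation), the snippet below averages a shared classifier's predictions over rotation-augmented views of an input. The rotation set, the linear classifier, and its random weights are hypothetical placeholders standing in for the VDA operations and the downstream-task model.

```python
import numpy as np

def augment_views(x):
    """One common choice of view data augmentation (VDA):
    rotate a 2-D input by 0/90/180/270 degrees."""
    return [np.rot90(x, k) for k in range(4)]

def predict(view, W, b):
    """Shared downstream classifier applied to a single view;
    returns softmax class probabilities."""
    logits = W @ view.ravel() + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

def integrated_inference(x, W, b):
    """Average predictions over all augmented views,
    mirroring SSL-MV's integrated inference."""
    probs = [predict(v, W, b) for v in augment_views(x)]
    return np.mean(probs, axis=0)

# Toy example with hypothetical weights: a 4x4 input, 3 classes.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4))
W, b = rng.standard_normal((3, 16)), np.zeros(3)
p = integrated_inference(x, W, b)
print(p.shape, round(float(p.sum()), 6))  # averaged probabilities still sum to 1
```

Because each per-view output is a valid probability distribution, their mean is too, so the combined prediction can be used directly in place of the single-view one.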