Paper Title
SCVRL: Shuffled Contrastive Video Representation Learning
Authors
Abstract
We propose SCVRL, a novel contrastive framework for self-supervised learning on videos. Unlike previous contrastive learning methods, which mostly focus on learning visual semantics (e.g., CVRL), SCVRL is capable of learning both semantic and motion patterns. To this end, we reformulate the popular shuffling pretext task within a modern contrastive learning paradigm. We show that our transformer-based network has a natural capacity to learn motion in self-supervised settings and achieves strong performance, outperforming CVRL on four benchmarks.
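To make the idea of a shuffling pretext task inside a contrastive objective more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the loss, so a generic InfoNCE-style loss is assumed, the temporally shuffled clip is treated as a negative, and `toy_encode`, `shuffle_frames`, and `info_nce` are hypothetical names. The toy encoder merely stands in for the transformer-based network.

```python
import math
import random


def shuffle_frames(clip, rng):
    """Return the frames of `clip` in a different temporal order."""
    order = list(range(len(clip)))
    if len(order) < 2:
        raise ValueError("need at least two frames to shuffle")
    # Re-draw until the permutation differs from the original order.
    while order == sorted(order):
        rng.shuffle(order)
    return [clip[i] for i in order]


def toy_encode(clip):
    """Hypothetical order-sensitive encoder (stand-in for a real network).

    Frames are scalars here; the second component weights each frame by
    its position, so shuffling the frames changes the embedding.
    """
    return [sum(clip), sum(t * f for t, f in enumerate(clip))]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (assumed here, not taken from the paper)."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]


if __name__ == "__main__":
    rng = random.Random(0)
    clip = [0.1, 0.4, 0.9, 1.6]                  # toy "video" of 4 frames
    augmented = [f + 0.01 for f in clip]         # same order, light augmentation
    shuffled = shuffle_frames(clip, rng)         # same frames, wrong order

    loss = info_nce(
        toy_encode(clip),
        toy_encode(augmented),                   # positive: order preserved
        [toy_encode(shuffled)],                  # negative: order destroyed
    )
    print(f"loss = {loss:.4f}")
```

Because the shuffled clip contains exactly the same frames as the anchor, an encoder can only separate it from the positive by attending to temporal order, which is one way to read the abstract's claim that the objective encourages learning motion in addition to semantics.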