Learning Trajectory-Aware Transformer for Video Super-Resolution

Paper Title

Learning Trajectory-Aware Transformer for Video Super-Resolution

Authors

Chengxu Liu, Huan Yang, Jianlong Fu, Xueming Qian

Abstract

Video super-resolution (VSR) aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts. Although some progress has been made, there are grand challenges to effectively utilize temporal dependency in entire video sequences. Existing approaches usually align and aggregate video frames from limited adjacent frames (e.g., 5 or 7 frames), which prevents these approaches from satisfactory results. In this paper, we take one step further to enable effective spatio-temporal learning in videos. We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR). In particular, we formulate video frames into several pre-aligned trajectories which consist of continuous visual tokens. For a query token, self-attention is only learned on relevant visual tokens along spatio-temporal trajectories. Compared with vanilla vision Transformers, such a design significantly reduces the computational cost and enables Transformers to model long-range features. We further propose a cross-scale feature tokenization module to overcome scale-changing problems that often occur in long-range videos. Experimental results demonstrate the superiority of the proposed TTVSR over state-of-the-art models, by extensive quantitative and qualitative evaluations in four widely-used video super-resolution benchmarks. Both code and pre-trained models can be downloaded at https://github.com/researchmm/TTVSR.
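The idea of restricting self-attention to tokens along a trajectory can be illustrated with a small sketch. This is not the TTVSR implementation: the function name `attend_along_trajectory`, the nested-list data layout, and the use of plain softmax attention are all assumptions made for illustration. The trajectory is assumed to be given as integer coordinates rather than estimated from motion.

```python
import math


def attend_along_trajectory(frames, trajectory):
    """Attend from the last token of `trajectory` to every token on it.

    Hypothetical helper for illustration only.
    frames[t][row][col] is a feature vector (list of floats) for frame t.
    trajectory[t] is the (row, col) of the tracked location in frame t.
    Returns (output_vector, attention_weights).
    """
    # Gather one token per frame: the one lying on the trajectory.
    tokens = [frames[t][r][c] for t, (r, c) in enumerate(trajectory)]
    query = tokens[-1]  # the token in the most recent frame
    scale = math.sqrt(len(query))

    # Scaled dot-product scores between the query and each trajectory token.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]

    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the trajectory tokens (used here as values as well).
    output = [
        sum(w * tok[i] for w, tok in zip(weights, tokens))
        for i in range(len(query))
    ]
    return output, weights


# Toy input: 3 frames of 2x2 tokens with 2 channels each. The tracked
# content moves from (0, 0) to (0, 1) to (1, 1).
frames = [
    [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]],
    [[[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]],
    [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]]],
]
trajectory = [(0, 0), (0, 1), (1, 1)]
output, weights = attend_along_trajectory(frames, trajectory)
```

In this toy case the query is compared with 3 tokens (one per frame) instead of all 12 tokens in the clip, which is the kind of saving the abstract attributes to attending only along trajectories.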
