Paper Title
Exploring Long- and Short-Range Temporal Information for Learned Video Compression
Paper Authors
Paper Abstract
Learned video compression methods have attracted considerable interest in the video coding community, since they have matched or even surpassed the rate-distortion (RD) performance of traditional video codecs. However, many current learning-based methods are dedicated to exploiting short-range temporal information, which limits their performance. In this paper, we focus on the unique characteristics of video content and further explore temporal information to enhance compression performance. Specifically, to exploit long-range temporal information, we propose a temporal prior that is continuously updated within each group of pictures (GOP) during inference; it thus aggregates valuable temporal information from all frames decoded so far in the current GOP. As for short-range temporal information, we propose progressive guided motion compensation to achieve robust and effective compensation. In detail, we design a hierarchical structure to perform multi-scale compensation. More importantly, we use optical flow guidance to generate pixel offsets between the feature maps at each scale, and the compensation result at each scale guides the compensation at the next finer scale. Extensive experimental results demonstrate that our method achieves better RD performance than state-of-the-art video compression approaches. The code is publicly available at https://github.com/Huairui/LSTVC.
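To make the long-range mechanism concrete, below is a minimal PyTorch sketch of a GOP-scoped temporal prior. It assumes a simple convolutional update rule; the class name `TemporalPrior`, its channel width, and the `reset()` convention are hypothetical illustrations, not the authors' implementation (see the repository above for the actual code).

```python
# Minimal sketch of a GOP-scoped temporal prior (hypothetical API, PyTorch).
# The running state accumulates features of every decoded frame and is reset
# at each GOP boundary, so the prior only ever summarizes the current GOP,
# as the abstract describes.
import torch
import torch.nn as nn


class TemporalPrior(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # Fuse the running prior with the features of the newest decoded frame.
        self.update = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.prior = None  # running state, reset once per GOP

    def reset(self) -> None:
        """Call at every GOP boundary (e.g., on each intra frame)."""
        self.prior = None

    def forward(self, decoded_feat: torch.Tensor) -> torch.Tensor:
        if self.prior is None:
            self.prior = torch.zeros_like(decoded_feat)
        self.prior = self.update(torch.cat([self.prior, decoded_feat], dim=1))
        return self.prior
```

In such a setup, `reset()` would be called on each intra frame, and the returned prior could serve as side information for the conditional encoder and decoder of every inter frame in the GOP.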
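Similarly, the short-range component can be sketched as coarse-to-fine, flow-guided alignment over a feature pyramid. The sketch below uses torchvision's DeformConv2d to realize the pixel offsets; the module names (`ScaleCompensation`, `ProgressiveCompensation`), the three-level pyramid, and the offset layout (flow repeated per kernel tap as the base displacement) are assumptions of this illustration rather than the paper's exact design.

```python
# Hedged sketch of progressive guided motion compensation (PyTorch).
# Assumptions: a coarsest-first feature pyramid, 3x3 deformable kernels,
# and flow channels ordered to match DeformConv2d's per-tap offset layout.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d


class ScaleCompensation(nn.Module):
    """One pyramid level: flow-guided deformable alignment of reference features."""

    def __init__(self, channels: int):
        super().__init__()
        # Predict 3x3 deformable offsets from the reference features, the
        # rescaled flow, and the guidance upsampled from the coarser scale.
        self.offset_head = nn.Conv2d(2 * channels + 2, 18, kernel_size=3, padding=1)
        self.align = DeformConv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, ref_feat, flow, guide):
        residual = self.offset_head(torch.cat([ref_feat, guide, flow], dim=1))
        # The flow provides the base displacement for every kernel tap;
        # the head only predicts a per-pixel refinement on top of it.
        offsets = residual + flow.repeat(1, 9, 1, 1)
        return self.align(ref_feat, offsets)


class ProgressiveCompensation(nn.Module):
    """Coarse-to-fine compensation over a multi-scale feature pyramid."""

    def __init__(self, channels: int = 64, num_levels: int = 3):
        super().__init__()
        self.levels = nn.ModuleList(
            [ScaleCompensation(channels) for _ in range(num_levels)]
        )

    def forward(self, ref_pyramid, flow):
        # ref_pyramid: reference feature maps, coarsest first; flow at full resolution.
        guide = torch.zeros_like(ref_pyramid[0])
        for level, ref_feat in zip(self.levels, ref_pyramid):
            size = ref_feat.shape[-2:]
            # Resize the flow field and rescale its magnitudes to this level.
            lvl_flow = F.interpolate(flow, size=size, mode="bilinear", align_corners=False)
            lvl_flow = lvl_flow * (size[-1] / flow.shape[-1])
            if guide.shape[-2:] != size:
                guide = F.interpolate(guide, size=size, mode="bilinear", align_corners=False)
            guide = level(ref_feat, lvl_flow, guide)  # result guides the next scale
        return guide
```

This sketch isolates the compensation path only; in the full model, the aligned features at the finest scale would feed the subsequent conditional coding stages.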