CGCV: Context Guided Correlation Volume for Optical Flow Neural Networks

Paper Title

CGCV: Context Guided Correlation Volume for Optical Flow Neural Networks

Authors

Li, Jiangpeng; Niu, Yan

Abstract

Optical flow, which computes the apparent motion from a pair of video frames, is a critical tool for scene motion estimation. The correlation volume is the central component of neural models for optical flow computation. It estimates the pairwise matching costs between cross-frame features, and is then used to decode optical flow. However, the traditional correlation volume is frequently noisy, outlier-prone, and sensitive to motion blur. We observe that, although the recent RAFT algorithm also adopts the traditional correlation volume, its additional context encoder provides semantically representative features to the flow decoder, implicitly compensating for the deficiency of the correlation volume. However, the benefits of this context encoder have barely been discussed or exploited. In this paper, we first investigate the functionality of RAFT's context encoder, then propose a new Context Guided Correlation Volume (CGCV) via gating and lifting schemes. CGCV can be universally integrated with RAFT-based flow computation methods for enhanced performance, and is especially effective in the presence of motion blur, de-focus blur and atmospheric effects. By incorporating the proposed CGCV into the previous Global Motion Aggregation (GMA) method, at a minor cost of 0.5% extra parameters, the rank of GMA is lifted by 23 places on the KITTI 2015 Leader Board, and by 3 places on the Sintel Leader Board. Moreover, at a similar model size, our correlation volume achieves competitive or superior performance to state-of-the-art peer supervised models that employ Transformers or Graph Reasoning, as verified by extensive experiments.
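
To make the notion of "pairwise matching costs between cross-frame features" concrete, the following is a minimal NumPy sketch of a traditional all-pairs correlation volume of the kind used in RAFT: a dot product between every feature vector of frame 1 and every feature vector of frame 2, scaled by the square root of the channel count. It is not the CGCV proposed in the paper; the gating and lifting schemes are not specified in the abstract and are not reproduced here. The function name `all_pairs_correlation`, the feature shapes, and the random test features are assumptions made for illustration only.

```python
import numpy as np

def all_pairs_correlation(feat1, feat2):
    """Traditional all-pairs correlation volume (illustrative sketch).

    feat1, feat2: feature maps of shape (C, H, W) from the two frames.
    Returns an array of shape (H, W, H, W) where entry [i, j, k, l] is the
    scaled dot product between feat1[:, i, j] and feat2[:, k, l].
    """
    c, h, w = feat1.shape
    f1 = feat1.reshape(c, h * w)          # (C, H*W)
    f2 = feat2.reshape(c, h * w)          # (C, H*W)
    corr = f1.T @ f2 / np.sqrt(c)         # (H*W, H*W) pairwise similarities
    return corr.reshape(h, w, h, w)

# Hypothetical toy features: unit-norm random vectors per pixel.
rng = np.random.default_rng(0)
feat1 = rng.standard_normal((8, 4, 5))
feat1 /= np.linalg.norm(feat1, axis=0, keepdims=True)

# Simulate frame 2 as frame 1 shifted one pixel to the right (with wrap-around).
feat2 = np.roll(feat1, shift=1, axis=2)

corr = all_pairs_correlation(feat1, feat2)

# For pixel (row 2, col 2) in frame 1, find the best-matching pixel in frame 2.
best = np.unravel_index(np.argmax(corr[2, 2]), corr.shape[2:])
print(best)  # (row, col) of the strongest match in frame 2
```

Because the toy features are unit-normalized, the strongest response for a pixel is at its true displaced position. Real encoder features under motion blur or defocus do not behave this cleanly, which is the kind of noise and outlier sensitivity the abstract attributes to the traditional correlation volume.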
