Paper Title
A Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions
Paper Authors
Paper Abstract
Motion estimation is one of the core challenges in computer vision. With traditional dual-frame approaches, occlusions and out-of-view motions are a limiting factor, especially in the context of environmental perception for vehicles, due to the large (ego-)motion of objects. Our work proposes a novel data-driven approach for temporal fusion of scene flow estimates in a multi-frame setup to overcome the issue of occlusion. In contrast to most previous methods, we do not rely on a constant motion model but instead learn a generic temporal relation of motion from data. In a second step, a neural network combines bi-directional scene flow estimates from a common reference frame, yielding a refined estimate and occlusion masks as a natural byproduct. In this way, our approach provides a fast multi-frame extension for a variety of scene flow estimators that outperforms the underlying dual-frame approaches.
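To make the second step more concrete, below is a minimal sketch (in PyTorch) of a fusion network of the kind the abstract describes: a small CNN that takes a forward scene flow estimate and a backward estimate already propagated into the common reference frame, and outputs a refined flow plus a per-pixel occlusion mask. The class name, channel counts, and layer choices are hypothetical assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class TemporalFusionNet(nn.Module):
    """Hypothetical fusion sketch: combine bi-directional scene flow estimates
    in a common reference frame into a refined flow and an occlusion mask."""

    def __init__(self, flow_channels: int = 4):
        super().__init__()
        # Input: forward flow and propagated backward flow, stacked along channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(2 * flow_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Two heads: a residual refinement of the flow and an occlusion probability.
        self.flow_head = nn.Conv2d(64, flow_channels, kernel_size=3, padding=1)
        self.occ_head = nn.Conv2d(64, 1, kernel_size=3, padding=1)

    def forward(self, flow_fwd: torch.Tensor, flow_bwd_prop: torch.Tensor):
        feat = self.encoder(torch.cat([flow_fwd, flow_bwd_prop], dim=1))
        refined_flow = self.flow_head(feat) + flow_fwd    # residual refinement
        occlusion = torch.sigmoid(self.occ_head(feat))    # 1 = likely occluded
        return refined_flow, occlusion


if __name__ == "__main__":
    # Example channel layout assumption: (u, v, disparity, disparity change).
    net = TemporalFusionNet(flow_channels=4)
    f_fwd = torch.randn(1, 4, 128, 256)
    f_bwd = torch.randn(1, 4, 128, 256)   # backward estimate mapped to the reference frame
    flow, occ = net(f_fwd, f_bwd)
    print(flow.shape, occ.shape)          # (1, 4, 128, 256), (1, 1, 128, 256)
```

Because the fusion network only consumes precomputed flow fields, such a module could in principle be attached to different dual-frame scene flow estimators, which is the multi-frame extension property the abstract claims.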