Paper Title
A Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions
Paper Authors
Paper Abstract
Motion estimation is one of the core challenges in computer vision. With traditional dual-frame approaches, occlusions and out-of-view motions are a limiting factor, especially in the context of environmental perception for vehicles, due to the large (ego-)motion of objects. Our work proposes a novel data-driven approach for temporal fusion of scene flow estimates in a multi-frame setup to overcome the issue of occlusion. In contrast to most previous methods, we do not rely on a constant motion model but instead learn a generic temporal relation of motion from data. In a second step, a neural network combines bi-directional scene flow estimates from a common reference frame, yielding a refined estimate and occlusion masks as a natural byproduct. In this way, our approach provides a fast multi-frame extension for a variety of scene flow estimators that outperforms the underlying dual-frame approaches.
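To make the second step more concrete, below is a minimal sketch (in PyTorch) of a fusion network of the kind the abstract describes: a small CNN that takes a forward scene flow estimate and a backward estimate already propagated into the common reference frame, and outputs a refined flow plus a per-pixel occlusion mask. The class name, channel counts, and layer choices are hypothetical assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class TemporalFusionNet(nn.Module):
    """Hypothetical fusion sketch: combine bi-directional scene flow estimates
    in a common reference frame into a refined flow and an occlusion mask."""

    def __init__(self, flow_channels: int = 4):
        super().__init__()
        # Input: forward flow and propagated backward flow, stacked along channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(2 * flow_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Two heads: a residual refinement of the flow and an occlusion probability.
        self.flow_head = nn.Conv2d(64, flow_channels, kernel_size=3, padding=1)
        self.occ_head = nn.Conv2d(64, 1, kernel_size=3, padding=1)

    def forward(self, flow_fwd: torch.Tensor, flow_bwd_prop: torch.Tensor):
        feat = self.encoder(torch.cat([flow_fwd, flow_bwd_prop], dim=1))
        refined_flow = self.flow_head(feat) + flow_fwd    # residual refinement
        occlusion = torch.sigmoid(self.occ_head(feat))    # 1 = likely occluded
        return refined_flow, occlusion


if __name__ == "__main__":
    # Example channel layout assumption: (u, v, disparity, disparity change).
    net = TemporalFusionNet(flow_channels=4)
    f_fwd = torch.randn(1, 4, 128, 256)
    f_bwd = torch.randn(1, 4, 128, 256)   # backward estimate mapped to the reference frame
    flow, occ = net(f_fwd, f_bwd)
    print(flow.shape, occ.shape)          # (1, 4, 128, 256), (1, 1, 128, 256)
```

Because the fusion network only consumes precomputed flow fields, such a module could in principle be attached to different dual-frame scene flow estimators, which is the multi-frame extension property the abstract claims.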