Paper Title
GlobalFlowNet: Video Stabilization using Deep Distilled Global Motion Estimates
Paper Authors
Paper Abstract
Videos shot by laymen using hand-held cameras contain undesirable shaky motion. Estimating the global motion between successive frames, in a manner not influenced by moving objects, is central to many video stabilization techniques, but poses significant challenges. A large body of work uses 2D affine transformations or homography for the global motion. However, in this work, we introduce a more general representation scheme, which adapts any existing optical flow network to ignore the moving objects and obtain a spatially smooth approximation of the global motion between video frames. We achieve this through a knowledge distillation approach, in which we first introduce a low-pass filter module into the optical flow network to constrain the predicted optical flow to be spatially smooth. This becomes our student network, named \textsc{GlobalFlowNet}. Then, using the original optical flow network as the teacher network, we train the student network using a robust loss function. Given a trained \textsc{GlobalFlowNet}, we stabilize videos using a two-stage process. In the first stage, we correct the instability in the affine parameters using a quadratic programming approach, constrained by a user-specified cropping limit to control the loss of field of view. In the second stage, we stabilize the video further by smoothing the global motion parameters, expressed using a small number of discrete cosine transform (DCT) coefficients. In extensive experiments on a variety of videos, our technique outperforms state-of-the-art techniques in terms of subjective quality and different quantitative measures of video stability. The source code is publicly available at \href{https://github.com/GlobalFlowNet/GlobalFlowNet}{https://github.com/GlobalFlowNet/GlobalFlowNet}.
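As a concrete illustration of the low-pass filter idea, the following Python snippet (a minimal sketch under our own assumptions, not the authors' implementation) projects a dense flow field onto its lowest-frequency 2D DCT components, which forces the result to be spatially smooth; the cutoff k and the helper name low_pass_flow are illustrative choices.

\begin{verbatim}
import numpy as np
from scipy.fft import dctn, idctn

def low_pass_flow(flow, k=8):
    """Keep only the k x k lowest-frequency 2D DCT coefficients
    of each flow channel, yielding a spatially smooth flow.

    flow: array of shape (H, W, 2) holding x/y displacements.
    """
    smooth = np.zeros_like(flow)
    for c in range(2):
        coeffs = dctn(flow[..., c], norm='ortho')
        mask = np.zeros_like(coeffs)
        mask[:k, :k] = 1.0  # retain low spatial frequencies only
        smooth[..., c] = idctn(coeffs * mask, norm='ortho')
    return smooth

# Usage: a noisy flow field collapses to its smooth global component.
flow = np.random.default_rng(0).normal(size=(240, 320, 2))
print(low_pass_flow(flow).shape)  # (240, 320, 2)
\end{verbatim}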
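The first smoothing stage can likewise be sketched as a small quadratic program. The objective below (fidelity to the original path plus an acceleration penalty) and the per-frame deviation bound standing in for the cropping limit are our own illustrative formulation, not the paper's exact one; the sketch uses the cvxpy modeling library.

\begin{verbatim}
import numpy as np
import cvxpy as cp

def smooth_parameter(p, crop_limit=20.0, weight=10.0):
    """Smooth one affine-parameter trajectory p (shape (T,)) by QP."""
    q = cp.Variable(p.shape[0])
    # Stay close to the original path while penalizing its acceleration.
    objective = cp.Minimize(cp.sum_squares(q - p)
                            + weight * cp.sum_squares(cp.diff(q, 2)))
    # Bounding the deviation from the original path bounds the crop needed.
    constraints = [cp.abs(q - p) <= crop_limit]
    cp.Problem(objective, constraints).solve()
    return q.value

t = np.arange(100)
shaky = 0.1 * t + 5.0 * np.random.default_rng(1).normal(size=100)
print(smooth_parameter(shaky)[:5])
\end{verbatim}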
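The second stage's representation can be sketched in the same spirit: each global-motion parameter trajectory is expressed with a small number of temporal DCT coefficients, and dropping the high-frequency ones removes the jitter. The cutoff n_keep and the function name dct_smooth are illustrative, not taken from the paper.

\begin{verbatim}
import numpy as np
from scipy.fft import dct, idct

def dct_smooth(trajectory, n_keep=10):
    """trajectory: per-frame parameter values, shape (T,)."""
    coeffs = dct(np.asarray(trajectory, dtype=float), norm='ortho')
    coeffs[n_keep:] = 0.0  # discard high temporal frequencies (jitter)
    return idct(coeffs, norm='ortho')
\end{verbatim}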