AdaFNIO: Adaptive Fourier Neural Interpolation Operator for video frame interpolation

Paper Title

AdaFNIO: Adaptive Fourier Neural Interpolation Operator for video frame interpolation

Authors

Hrishikesh Viswanath, Md Ashiqur Rahman, Rashmi Bhaskara, Aniket Bera

Abstract

We present AdaFNIO (Adaptive Fourier Neural Interpolation Operator), a neural operator-based architecture for video frame interpolation. Current deep learning-based methods rely on local convolutions for feature learning and are not scale-invariant, so their training data must be augmented through random flipping and re-scaling. AdaFNIO, in contrast, learns the features in the frames independently of input resolution, through token mixing and global convolution in Fourier space (the spectral domain) using the Fast Fourier Transform (FFT). We show that AdaFNIO can produce visually smooth and accurate results. To evaluate the visual quality of our interpolated frames, we calculate the structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) between the generated frame and the ground truth frame. We report the quantitative performance of our model on the Vimeo-90K, DAVIS, UCF101 and DISFA+ datasets.
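
As a rough illustration of the PSNR metric mentioned in the abstract, the sketch below computes it from the standard definition, PSNR = 10 * log10(MAX^2 / MSE). It is not taken from the paper: the function name `psnr`, the nested-list frame representation, and the 8-bit peak value of 255 are assumptions made for this example only.

```python
import math


def psnr(generated, ground_truth, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized frames.

    Frames are nested lists: one inner list of pixel intensities per row.
    max_value is the largest possible intensity (255 assumed for 8-bit data).
    """
    if len(generated) != len(ground_truth):
        raise ValueError("frames must have the same height")

    squared_error = 0.0
    count = 0
    for row_g, row_t in zip(generated, ground_truth):
        if len(row_g) != len(row_t):
            raise ValueError("frames must have the same width")
        for g, t in zip(row_g, row_t):
            squared_error += (g - t) ** 2
            count += 1

    if count == 0:
        raise ValueError("frames must not be empty")

    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 grayscale frames (hypothetical values, not from any dataset).
truth = [[0, 255], [128, 64]]
pred = [[0, 250], [130, 64]]
score = psnr(pred, truth)
print(f"PSNR: {score:.2f} dB")
```

Higher values mean the interpolated frame is closer to the ground truth. In practice the metric would be computed over full-resolution frames, typically with an array library instead of nested lists.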
