Title


Extending Temporal Data Augmentation for Video Action Recognition

Authors

Artjoms Gorpincenko, Michal Mackiewicz

Abstract


Pixel space augmentation has grown in popularity in many Deep Learning areas, due to its effectiveness, simplicity, and low computational cost. Data augmentation for videos, however, remains an under-explored research topic, as most works have treated inputs as stacks of static images rather than temporally linked series of data. Recently, it has been shown that involving the time dimension when designing augmentations can be superior to spatial-only variants for video action recognition. In this paper, we propose several novel enhancements to these techniques to strengthen the relationship between the spatial and temporal domains and achieve a deeper level of perturbations. The video action recognition results of our techniques outperform their respective spatial-only variants in Top-1 and Top-5 accuracy on the UCF-101 and HMDB-51 datasets.
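The abstract does not describe the paper's concrete techniques, but the core idea of temporally-aware augmentation can be illustrated with a hypothetical sketch: instead of erasing the same region in every frame (a spatial-only cutout applied to a stack of images), the erased patch drifts across frames, so the perturbation is linked to the time dimension. The function name and the linear-drift scheme below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def temporal_cutout(video, size=16, seed=None):
    """Erase a square patch whose position drifts linearly across frames.

    A spatial-only cutout would zero the same region in every frame;
    moving the patch over time ties the perturbation to the temporal
    dimension. `video` has shape (T, H, W, C); a modified copy is returned.
    (Illustrative sketch only; not the augmentation proposed in the paper.)
    """
    rng = np.random.default_rng(seed)
    t, h, w, _ = video.shape
    # Random start and end positions for the patch's top-left corner.
    y0, y1 = rng.integers(0, h - size, size=2)
    x0, x1 = rng.integers(0, w - size, size=2)
    out = video.copy()
    for i in range(t):
        a = i / max(t - 1, 1)                 # interpolation factor in [0, 1]
        y = int(round((1 - a) * y0 + a * y1))  # patch drifts linearly in y
        x = int(round((1 - a) * x0 + a * x1))  # patch drifts linearly in x
        out[i, y:y + size, x:x + size, :] = 0
    return out
```

Setting the start and end corners equal recovers the static, spatial-only variant, which makes the temporal extension a strict generalization of the image-level augmentation.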
