Bootstrapped Representation Learning for Skeleton-Based Action Recognition

Paper Title

Bootstrapped Representation Learning for Skeleton-Based Action Recognition

Authors

Olivier Moliner, Sangxia Huang, Kalle Åström

Abstract

In this work, we study self-supervised representation learning for 3D skeleton-based action recognition. We extend Bootstrap Your Own Latent (BYOL) for representation learning on skeleton sequence data and propose a new data augmentation strategy including two asymmetric transformation pipelines. We also introduce a multi-viewpoint sampling method that leverages multiple viewing angles of the same action captured by different cameras. In the semi-supervised setting, we show that the performance can be further improved by knowledge distillation from wider networks, leveraging once more the unlabeled samples. We conduct extensive experiments on the NTU-60 and NTU-120 datasets to demonstrate the performance of our proposed method. Our method consistently outperforms the current state of the art on both linear evaluation and semi-supervised benchmarks.
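The abstract builds on BYOL, in which an online network is trained to predict the output of a slowly moving target network for a differently augmented view of the same sample. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below shows the BYOL regression loss (mean squared error between L2-normalized vectors, equal to 2 minus twice their cosine similarity) and the exponential-moving-average target update. The function names, the plain-list representation of vectors and parameters, and the value `tau=0.99` are assumptions made for this example; the skeleton encoder, the two asymmetric augmentation pipelines, and the multi-viewpoint sampling described in the abstract are not modeled here.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def byol_loss(online_prediction, target_projection):
    """BYOL regression loss for one pair of views.

    Equals the squared distance between the two L2-normalized vectors,
    i.e. 2 - 2 * cosine_similarity. Inputs are assumed to be non-zero.
    """
    p = l2_normalize(online_prediction)
    z = l2_normalize(target_projection)
    return 2.0 - 2.0 * sum(a * b for a, b in zip(p, z))


def ema_update(target_params, online_params, tau=0.99):
    """Move target parameters toward the online parameters.

    tau is an illustrative decay rate (an assumption of this sketch).
    """
    return [tau * t + (1.0 - tau) * o
            for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    # Two hypothetical embeddings of two views of the same sequence.
    print(byol_loss([1.0, 0.0], [0.0, 1.0]))
    print(ema_update([1.0, 1.0], [0.0, 2.0], tau=0.9))
```

In practice the loss is usually symmetrized by swapping the two views and summing both terms, and gradients flow only through the online network; the target network changes only through the moving-average update.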
