Paper Title
Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition
Paper Authors
Paper Abstract
Skeleton-based action recognition has attracted considerable attention due to its compact representation of the human body's skeletal structure. Many recent methods have achieved remarkable performance using graph convolutional networks (GCNs) and convolutional neural networks (CNNs), which extract spatial and temporal features, respectively. Although spatial and temporal dependencies in the human skeleton have been explored separately, spatio-temporal dependency is rarely considered. In this paper, we propose the Spatio-Temporal Curve Network (STC-Net) to effectively leverage the spatio-temporal dependency of the human skeleton. Our proposed network consists of two novel elements: 1) the Spatio-Temporal Curve (STC) module; and 2) Dilated Kernels for Graph Convolution (DK-GC). The STC module dynamically adjusts the receptive field by identifying meaningful node connections between each pair of adjacent frames and generating spatio-temporal curves based on the identified node connections, providing adaptive spatio-temporal coverage. In addition, we propose DK-GC to capture long-range dependencies: applying an extended kernel to the given adjacency matrices of the graph yields a large receptive field without any additional parameters. Our STC-Net combines these two modules and achieves state-of-the-art performance on four skeleton-based action recognition benchmarks.
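The two ideas in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: the abstract does not specify how the kernel is extended or how node connections are scored, so the sketch assumes (a) a dilated adjacency that links joints exactly `dilation` hops apart, which widens the receptive field by reusing the skeleton graph rather than learning new weights, and (b) a greedy curve that follows the most similar node (by dot product) from one frame to the next. The names `hop_distances`, `dilated_adjacency`, and `build_curves`, and the `frames[t][v]` feature layout, are hypothetical.

```python
from collections import deque


def hop_distances(adj):
    """Breadth-first hop distance between every pair of nodes (-1 if unreachable)."""
    n = len(adj)
    dist = [[-1] * n for _ in range(n)]
    for src in range(n):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[src][v] < 0:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def dilated_adjacency(adj, dilation):
    """Link nodes that are exactly `dilation` hops apart (assumed dilation rule)."""
    dist = hop_distances(adj)
    n = len(adj)
    return [[1 if dist[i][j] == dilation else 0 for j in range(n)] for i in range(n)]


def build_curves(frames):
    """Greedy curves: from each start node, step to the most similar node in the next frame.

    frames[t][v] is the feature vector of node v at frame t (assumed layout).
    Similarity is a plain dot product, chosen only for illustration.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    curves = []
    for start in range(len(frames[0])):
        curve = [start]
        for t in range(len(frames) - 1):
            current = frames[t][curve[-1]]
            scores = [dot(current, candidate) for candidate in frames[t + 1]]
            curve.append(max(range(len(scores)), key=scores.__getitem__))
        curves.append(curve)
    return curves


# Toy skeleton: a 5-joint chain 0-1-2-3-4.
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
two_hop = dilated_adjacency(chain, 2)

# Toy features: 3 frames, 2 nodes, 2-dimensional features per node.
toy_frames = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[1, 0], [0, 1]],
]
toy_curves = build_curves(toy_frames)
```

In this toy setup, `two_hop` connects joint 2 to joints 0 and 4, and each curve in `toy_curves` hops between node indices as the features swap positions across frames, which is the kind of cross-frame, cross-joint connection that a purely spatial or purely temporal kernel would not follow.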