Paper Title
Interacting Hand-Object Pose Estimation via Dense Mutual Attention
Paper Authors
Paper Abstract
3D hand-object pose estimation is the key to the success of many computer vision applications. The main focus of this task is to effectively model the interaction between the hand and an object. To this end, existing works either rely on interaction constraints in a computationally expensive iterative optimization, or consider only a sparse correlation between sampled hand and object keypoints. In contrast, we propose a novel dense mutual attention mechanism that is able to model fine-grained dependencies between the hand and the object. Specifically, we first construct the hand and object graphs according to their mesh structures. For each hand node, we aggregate features from every object node via learned attention, and vice versa for each object node. Thanks to such dense mutual attention, our method is able to produce physically plausible poses with high quality and real-time inference speed. Extensive quantitative and qualitative experiments on large benchmark datasets show that our method outperforms state-of-the-art methods. The code is available at https://github.com/rongakowang/DenseMutualAttention.git.
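To make the described mechanism concrete, the sketch below shows one possible PyTorch formulation of dense mutual attention between hand and object graph nodes: every hand node attends to every object node and vice versa. This is an illustrative assumption, not the authors' released code; the module name DenseMutualAttention, the single-head formulation, the residual connections, and all tensor names are hypothetical (see the repository linked above for the actual implementation).

```python
# Minimal sketch of dense mutual (cross-)attention between hand and object
# node features. Hypothetical, single-head formulation; not the released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseMutualAttention(nn.Module):
    """Every hand node aggregates features from every object node, and vice versa."""

    def __init__(self, dim: int):
        super().__init__()
        # Separate query/key/value projections for each attention direction.
        self.q_hand = nn.Linear(dim, dim)
        self.k_obj = nn.Linear(dim, dim)
        self.v_obj = nn.Linear(dim, dim)
        self.q_obj = nn.Linear(dim, dim)
        self.k_hand = nn.Linear(dim, dim)
        self.v_hand = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, hand_feats: torch.Tensor, obj_feats: torch.Tensor):
        # hand_feats: (B, N_h, dim), obj_feats: (B, N_o, dim)
        # Hand -> object: attention over every hand-object node pair, (B, N_h, N_o).
        attn_h = F.softmax(
            self.q_hand(hand_feats) @ self.k_obj(obj_feats).transpose(-2, -1) * self.scale,
            dim=-1,
        )
        hand_out = hand_feats + attn_h @ self.v_obj(obj_feats)

        # Object -> hand: the symmetric aggregation for each object node, (B, N_o, N_h).
        attn_o = F.softmax(
            self.q_obj(obj_feats) @ self.k_hand(hand_feats).transpose(-2, -1) * self.scale,
            dim=-1,
        )
        obj_out = obj_feats + attn_o @ self.v_hand(hand_feats)
        return hand_out, obj_out

# Usage: 778 MANO hand vertices and a hypothetical 1000-vertex object mesh
# with 64-dimensional node features.
layer = DenseMutualAttention(dim=64)
hand = torch.randn(1, 778, 64)
obj = torch.randn(1, 1000, 64)
hand, obj = layer(hand, obj)  # shapes preserved: (1, 778, 64), (1, 1000, 64)
```

Under this formulation, each layer costs O(N_h x N_o) attention weights but runs in a single feed-forward pass, which is consistent with the abstract's claim of modeling fine-grained hand-object dependencies at real-time inference speed rather than through iterative optimization.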