Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation From Monocular RGB Image

Paper Title

Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation From Monocular RGB Image

Authors

Fan, Zhaoxin; Song, Zhenbo; Xu, Jian; Wang, Zhicheng; Wu, Kejian; Liu, Hongyan; He, Jun

Abstract

Recently, RGBD-based category-level 6D object pose estimation has achieved promising performance improvements; however, its reliance on depth information prevents broader application. To alleviate this problem, this paper proposes a novel approach named Object Level Depth reconstruction Network (OLD-Net), which takes only RGB images as input for category-level 6D object pose estimation. We propose to predict object-level depth directly from a monocular RGB image by deforming the category-level shape prior into object-level depth and the canonical NOCS representation. Two novel modules, Normalized Global Position Hints (NGPH) and Shape-aware Decoupled Depth Reconstruction (SDDR), are introduced to learn high-fidelity object-level depth and delicate shape representations. Finally, the 6D object pose is solved by aligning the predicted canonical representation with the back-projected object-level depth. Extensive experiments on the challenging CAMERA25 and REAL275 datasets indicate that our model, though simple, achieves state-of-the-art performance.
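The final alignment step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which alignment algorithm is used, so the code below assumes a pinhole camera model for back-projection and the Umeyama least-squares similarity transform, a common choice in NOCS-style pipelines. The function names `back_project` and `umeyama`, and the intrinsics parameters `fx`, `fy`, `cx`, `cy`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def back_project(depth, mask, fx, fy, cx, cy):
    """Lift masked depth pixels to 3D camera-frame points (pinhole model, assumed)."""
    v, u = np.nonzero(mask)              # pixel rows (v) and columns (u) inside the mask
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)   # shape (N, 3)


def umeyama(src, dst):
    """Least-squares similarity transform such that dst ~= s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding points.
    This is the Umeyama method, assumed here; the abstract does not name a solver.
    """
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)     # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                   # avoid returning a reflection
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t
```

In such a pipeline, `src` would hold the predicted canonical (NOCS) coordinates and `dst` the back-projected object-level depth points for the same pixels; the recovered `R` and `t` give the 6D pose and `s` the object scale.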
