Paper Title

PartAfford: Part-level Affordance Discovery from 3D Objects

Authors

Chao Xu, Yixin Chen, He Wang, Song-Chun Zhu, Yixin Zhu, Siyuan Huang

Abstract

Understanding what objects can furnish for humans, namely learning object affordance, is the crux of bridging perception and action. In the vision community, prior work primarily focuses on learning object affordance with dense (e.g., per-pixel) supervision. In stark contrast, we humans learn object affordance without dense labels. The fundamental question in devising a computational model is thus: what is the natural way to learn object affordance from visual appearance and geometry with human-like sparse supervision? In this work, we present a new task of part-level affordance discovery (PartAfford): given only the affordance labels per object, the machine is tasked to (i) decompose 3D shapes into parts and (ii) discover how each part of the object corresponds to a certain affordance category. We propose a novel learning framework for PartAfford, which discovers part-level representations by leveraging only affordance set supervision and geometric primitive regularization, without dense supervision. The proposed approach consists of two main components: (i) an abstraction encoder with slot attention for unsupervised clustering and abstraction, and (ii) an affordance decoder with branches for part reconstruction, affordance prediction, and cuboidal primitive regularization. To learn and evaluate PartAfford, we construct a part-level, cross-category 3D object affordance dataset, annotated with 24 affordance categories shared among >25,000 objects. We demonstrate that our method enables both the abstraction of 3D objects and part-level affordance discovery, with generalizability to difficult and cross-category examples. Further ablations reveal the contribution of each component.
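The abstraction encoder described above relies on slot attention for unsupervised clustering. As a point of reference, below is a minimal PyTorch sketch of the standard Slot Attention module (Locatello et al., 2020) that such an encoder could build on. This is not the authors' implementation; the slot count, iteration count, and the use of per-point 3D shape features are illustrative assumptions.

```python
# Minimal sketch of Slot Attention (Locatello et al., 2020); illustrative,
# not the PartAfford authors' code. Shapes, slot count, and iteration count
# are assumptions chosen for the example.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlotAttention(nn.Module):
    def __init__(self, num_slots: int, dim: int, iters: int = 3, eps: float = 1e-8):
        super().__init__()
        self.num_slots, self.iters, self.eps = num_slots, iters, eps
        self.scale = dim ** -0.5
        # Learned Gaussian from which the initial slots are sampled.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_logsigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.norm_input = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_mlp = nn.LayerNorm(dim)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # inputs: (batch, num_points, dim) per-point features of a 3D shape.
        b, n, d = inputs.shape
        inputs = self.norm_input(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        sigma = self.slots_logsigma.exp().expand(b, self.num_slots, -1)
        slots = self.slots_mu.expand(b, self.num_slots, -1) + sigma * torch.randn_like(sigma)
        for _ in range(self.iters):
            slots_prev = slots
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the slot axis: slots compete to explain each point,
            # which yields the unsupervised clustering behavior.
            attn = F.softmax(torch.einsum('bnd,bkd->bnk', k, q) * self.scale, dim=-1)
            attn = attn + self.eps
            attn = attn / attn.sum(dim=1, keepdim=True)  # weighted mean over points
            updates = torch.einsum('bnk,bnd->bkd', attn, v)
            slots = self.gru(updates.reshape(-1, d), slots_prev.reshape(-1, d)).reshape(b, -1, d)
            slots = slots + self.mlp(self.norm_mlp(slots))
        return slots  # (batch, num_slots, dim): one abstraction per candidate part

# Example: cluster 1024 per-point features into 8 candidate part slots.
feats = torch.randn(2, 1024, 64)
parts = SlotAttention(num_slots=8, dim=64)(feats)  # -> (2, 8, 64)
```

In a framework like the one the abstract describes, each returned slot vector would then feed the decoder branches (part reconstruction, affordance prediction, and cuboidal primitive regularization).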
