Paper Title
End-to-End Learning to Grasp via Sampling from Object Point Clouds
Paper Authors
Paper Abstract
The ability to grasp objects is an essential skill that enables many robotic manipulation tasks. Recent works have studied point cloud-based methods for object grasping by starting from simulated datasets and have shown promising performance in real-world scenarios. Nevertheless, many of them still rely on ad-hoc geometric heuristics to generate grasp candidates, which fail to generalize to objects whose shapes differ significantly from those observed during training. Several approaches exploit complex multi-stage learning strategies and local neighborhood feature extraction while ignoring semantic global information. Furthermore, they are inefficient in terms of the number of training samples and the time required for inference. In this paper, we propose an end-to-end learning solution to generate 6-DOF parallel-jaw grasps starting from a 3D partial view of the object. Our Learning to Grasp (L2G) method gathers information from the input point cloud through a new procedure that combines a differentiable sampling strategy, which identifies the visible contact points, with a feature encoder that leverages local and global cues. Overall, L2G is guided by a multi-task objective that generates a diverse set of grasps by optimizing contact point sampling, grasp regression, and grasp classification. With a thorough experimental analysis, we show the effectiveness of L2G as well as its robustness and generalization abilities.
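Since the abstract names a multi-task objective over contact point sampling, grasp regression, and grasp classification but shows no implementation, the following is a minimal PyTorch-style sketch of how such an objective might be composed. The module name `L2GLoss`, the loss weights, and the use of a one-sided Chamfer distance as a differentiable surrogate for the sampling term are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class L2GLoss(nn.Module):
    """Hypothetical multi-task objective combining the three terms the
    abstract names: contact-point sampling, grasp regression, and grasp
    classification. Weights and loss forms are illustrative placeholders."""

    def __init__(self, w_sample=1.0, w_reg=1.0, w_cls=1.0):
        super().__init__()
        self.w_sample, self.w_reg, self.w_cls = w_sample, w_reg, w_cls
        self.cls_loss = nn.BCEWithLogitsLoss()

    def forward(self, sampled_pts, gt_contacts, pred_grasps, gt_grasps,
                cls_logits, cls_labels):
        # Sampling term: pull each sampled point toward its nearest
        # ground-truth contact (one-sided Chamfer distance, which stays
        # differentiable with respect to the sampled coordinates).
        dists = torch.cdist(sampled_pts, gt_contacts)   # (B, S, C)
        l_sample = dists.min(dim=-1).values.mean()
        # Regression term: L1 error on the predicted grasp parameters
        # (e.g., a 6-DOF pose plus gripper width, flattened per grasp).
        l_reg = torch.abs(pred_grasps - gt_grasps).mean()
        # Classification term: score each candidate grasp as
        # feasible / infeasible from its logit.
        l_cls = self.cls_loss(cls_logits, cls_labels.float())
        return (self.w_sample * l_sample
                + self.w_reg * l_reg
                + self.w_cls * l_cls)
```

Under this sketch, the sampling term rewards the differentiable sampler for landing on visible contact points, while the regression and classification terms jointly shape the grasps predicted from those points; how the three terms are actually balanced in L2G is specified in the paper, not here.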