Paper Title

Improving Robotic Grasping on Monocular Images Via Multi-Task Learning and Positional Loss

Paper Authors

William Prew, Toby Breckon, Magnus Bordewich, Ulrik Beierholm

Paper Abstract

In this paper, we introduce two methods of improving real-time object grasping performance from monocular colour images in an end-to-end CNN architecture. The first is the addition of an auxiliary task during model training (multi-task learning). Our multi-task CNN model improves grasping performance from a baseline average of 72.04% to 78.14% on the large Jacquard grasping dataset when performing a supplementary depth reconstruction task. The second is introducing a positional loss function that emphasises loss per pixel for secondary parameters (gripper angle and width) only on points of an object where a successful grasp can take place. This increases performance from a baseline average of 72.04% to 78.92% as well as reducing the number of training epochs required. These methods can also be performed in tandem, resulting in a further performance increase to 79.12% while maintaining sufficient inference speed to afford real-time grasp processing.
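To illustrate the positional loss idea described in the abstract, below is a minimal PyTorch-style sketch that restricts the per-pixel loss on the secondary grasp parameters (gripper angle and width) to pixels where a successful grasp can take place. It assumes an MSE objective and that the ground-truth grasp-quality map marks graspable pixels; the function name, masking rule, and equal weighting are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def positional_grasp_loss(pred_quality, pred_angle, pred_width,
                          gt_quality, gt_angle, gt_width):
    """Sketch of a positional loss: secondary parameters (angle, width)
    are only penalised at pixels where a successful grasp can occur,
    taken here to be the positive pixels of the ground-truth quality map.
    (Illustrative assumption, not the authors' exact implementation.)"""
    # Grasp quality is supervised densely over the whole image.
    quality_loss = F.mse_loss(pred_quality, gt_quality)

    # Binary mask of graspable pixels (hypothetical masking rule).
    mask = (gt_quality > 0).float()
    n_valid = mask.sum().clamp(min=1.0)

    # Angle and width losses are averaged only over the masked points.
    angle_loss = ((pred_angle - gt_angle) ** 2 * mask).sum() / n_valid
    width_loss = ((pred_width - gt_width) ** 2 * mask).sum() / n_valid

    return quality_loss + angle_loss + width_loss
```

In this sketch, the dense quality term keeps the graspability map supervised everywhere, while the masked terms stop background pixels from dominating the angle and width gradients, which is the intuition behind emphasising loss only at valid grasp points.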
