Paper Title
Estimating the Pose of a Euro Pallet with an RGB Camera based on Synthetic Training Data
Paper Authors
Paper Abstract
Estimating the pose of pallets and other logistics objects is crucial for various use cases, such as automated material handling or tracking. Innovations in computer vision, computing power, and machine learning open up new opportunities for device-free localization based on cameras and neural networks. Training such networks requires large image datasets with annotated poses. Manual annotation, especially of 6D poses, is an extremely labor-intensive process. Hence, newer approaches often leverage synthetic training data to automate the generation of annotated image datasets. This work presents the generation of synthetic training data for 6D pose estimation of pallets. The data is then used to train the Deep Object Pose Estimation (DOPE) algorithm. The experimental validation of the algorithm demonstrates that 6D pose estimation of a standardized Euro pallet with a Red-Green-Blue (RGB) camera is feasible. A comparison of the results from three different datasets under different lighting conditions shows the importance of an appropriate dataset design for accurate and robust localization. The quantitative evaluation shows an average position error of less than 20 cm for the preferred dataset. The validated training dataset and a photorealistic model of a Euro pallet are made publicly available.
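The abstract reports an average position error but does not define the metric. As a minimal sketch, assuming the error is the mean Euclidean distance between estimated and ground-truth pallet positions, it could be computed as follows. The function name `mean_position_error` and all sample values are hypothetical and are not taken from the paper.

```python
import math


def mean_position_error(estimated, ground_truth):
    """Mean Euclidean distance between paired 3D positions.

    Assumption: both lists hold (x, y, z) tuples in the same unit and
    the same coordinate frame; the result is in that unit.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("need two non-empty position lists of equal length")
    distances = [math.dist(e, g) for e, g in zip(estimated, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical positions in metres (illustrative values, not from the paper).
estimated = [(1.0, 0.0, 2.0), (0.5, 0.1, 3.0)]
ground_truth = [(1.0, 0.0, 2.1), (0.5, 0.4, 3.4)]

error_m = mean_position_error(estimated, ground_truth)
print(f"mean position error: {error_m * 100:.1f} cm")
```

Under this assumed definition, a dataset would meet the figure quoted in the abstract if `error_m` stays below 0.20 for the evaluated frames.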