Paper title
Multi-object Monocular SLAM for Dynamic Environments
Paper authors
Paper abstract
In this paper, we tackle the problem of multibody SLAM from a monocular camera. The term multibody implies that we track the motion of the camera as well as that of other dynamic participants in the scene. The quintessential challenge in dynamic scenes is unobservability: it is not possible to unambiguously triangulate a moving object from a moving monocular camera. Existing approaches solve restricted variants of the problem, but their solutions suffer from relative scale ambiguity (i.e., a family of infinitely many solutions exists for each pair of motions in the scene). We solve this rather intractable problem by leveraging single-view metrology, advances in deep learning, and category-level shape estimation. We propose a multi pose-graph optimization formulation to resolve the relative and absolute scale factor ambiguities involved. This optimization helps us reduce the average error in the trajectories of multiple bodies on real-world datasets such as KITTI. To the best of our knowledge, our method is the first practical monocular multibody SLAM system to perform dynamic multi-object and ego localization in metric scale within a unified framework.
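The relative scale ambiguity mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' method. It assumes a pinhole camera that translates without rotating, a single tracked object point, and made-up trajectories; the function names `project`, `observe`, and `rescale_object_track` are hypothetical. Scaling the object's position relative to the camera by any factor `s` changes the object's world trajectory but leaves every image observation unchanged, so image measurements alone cannot select one member of the family.

```python
def project(point_cam, focal=700.0):
    """Pinhole projection of a 3D point (camera frame) to image coordinates.
    The focal length is an arbitrary illustrative value."""
    x, y, z = point_cam
    return (focal * x / z, focal * y / z)


def observe(cam_positions, obj_positions):
    """Image observations of the object, one per frame.
    Assumption: the camera only translates, so the camera-frame point is p - c."""
    return [
        project(tuple(p_i - c_i for c_i, p_i in zip(c, p)))
        for c, p in zip(cam_positions, obj_positions)
    ]


def rescale_object_track(cam_positions, obj_positions, s):
    """Scale the object's camera-relative position by s in every frame."""
    return [
        tuple(c_i + s * (p_i - c_i) for c_i, p_i in zip(c, p))
        for c, p in zip(cam_positions, obj_positions)
    ]


# Hypothetical toy data: the camera moves forward 1 m per frame,
# while the object moves ahead of it and drifts sideways.
cam = [(0.0, 0.0, float(t)) for t in range(5)]
obj = [(2.0 + 0.5 * t, 0.0, 10.0 + 1.5 * t) for t in range(5)]

reference = observe(cam, obj)
for s in (0.5, 2.0, 10.0):
    candidate = rescale_object_track(cam, obj, s)
    same_pixels = all(
        abs(u1 - u2) < 1e-6 and abs(v1 - v2) < 1e-6
        for (u1, v1), (u2, v2) in zip(reference, observe(cam, candidate))
    )
    # Different world trajectories, identical image observations.
    print(f"s={s}: identical observations={same_pixels}, final position={candidate[-1]}")
```

Each value of `s` yields a distinct object trajectory that reproduces the same observations, which is why additional cues are needed to fix the scale; the abstract names single-view metrology and category-level shape estimation for this purpose.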