Paper Title
HMD-EgoPose: Head-Mounted Display-Based Egocentric Marker-Less Tool and Hand Pose Estimation for Augmented Surgical Guidance
Paper Authors
Abstract
The success or failure of modern computer-assisted surgery procedures hinges on the precise six-degree-of-freedom (6DoF) position and orientation (pose) estimation of tracked instruments and tissue. In this paper, we present HMD-EgoPose, a single-shot learning-based approach to hand and object pose estimation, and demonstrate state-of-the-art performance on a benchmark dataset for monocular red-green-blue (RGB) 6DoF marker-less hand and surgical instrument pose tracking. Further, we reveal the capacity of our HMD-EgoPose framework for performant 6DoF pose estimation on a commercially available optical see-through head-mounted display (OST-HMD) through a low-latency streaming approach. Our framework utilized an efficient convolutional neural network (CNN) backbone for multi-scale feature extraction and a set of subnetworks to jointly learn the 6DoF pose representation of the rigid surgical drill instrument and the grasping orientation of the hand of a user. To make our approach accessible on a commercially available OST-HMD, the Microsoft HoloLens 2, we created a pipeline for low-latency video and data communication with a high-performance computing workstation capable of optimized network inference. HMD-EgoPose outperformed current state-of-the-art approaches on a benchmark dataset for surgical tool pose estimation, achieving an average tool 3D vertex error of 11.0 mm on real data and furthering the progress towards a clinically viable marker-free tracking strategy. Through our low-latency streaming approach, we achieved a round-trip latency of 199.1 ms for pose estimation and augmented visualization of the tracked model when integrated with the OST-HMD. Our single-shot learned approach was robust to occlusion and complex surfaces and improved on current state-of-the-art approaches to marker-less tool and hand pose estimation.
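The average tool 3D vertex error reported above is an ADD-style metric: the mean Euclidean distance between the model's vertices transformed by the ground-truth pose and by the predicted pose. A minimal sketch of such a metric is shown below; the function name and toy data are illustrative assumptions, not code from the paper.

```python
import numpy as np

def average_vertex_error(vertices, R_gt, t_gt, R_pred, t_pred):
    """Mean 3D distance between model vertices under the ground-truth
    and predicted 6DoF poses (an ADD-style vertex error, in the same
    units as the vertices, e.g. millimeters).

    vertices: (N, 3) model points; R_*: (3, 3) rotations; t_*: (3,) translations.
    """
    gt = vertices @ R_gt.T + t_gt        # vertices in the ground-truth pose
    pred = vertices @ R_pred.T + t_pred  # vertices in the predicted pose
    return float(np.linalg.norm(gt - pred, axis=1).mean())

# Toy usage: identical poses give zero error; a pure translation offset
# of (3, 0, 4) mm gives a uniform 5.0 mm vertex error.
verts = np.random.default_rng(0).random((100, 3))
R, t = np.eye(3), np.zeros(3)
print(average_vertex_error(verts, R, t, R, t))                       # 0.0
print(average_vertex_error(verts, R, t, R, t + np.array([3.0, 0.0, 4.0])))  # 5.0
```

Because the error is evaluated over all model vertices, it penalizes both rotational and translational misalignment of the rigid drill model.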