Simulation-Driven Training of Vision Transformers Enabling Metal Segmentation in X-Ray Images

Paper Title

Simulation-Driven Training of Vision Transformers Enabling Metal Segmentation in X-Ray Images

Authors

Fuxin Fan, Ludwig Ritschl, Marcel Beister, Ramyar Biniazan, Björn Kreher, Tristan M. Gottschalk, Steffen Kappler, Andreas Maier

Abstract

In several image acquisition and processing steps of X-ray radiography (e.g., dose regulation, image contrast adjustment), knowing whether metal implants are present and exactly where they are is highly beneficial. Another application that would benefit from accurate metal segmentation is cone beam computed tomography (CBCT), which is based on 2D X-ray projections. Because metals attenuate X-rays strongly, severe artifacts occur in the 3D X-ray acquisitions. Metal segmentation in CBCT projections usually serves as a prerequisite for metal artifact avoidance and reduction algorithms. Since generating high-quality clinical training data is a constant challenge, this study proposes to generate simulated X-ray images from CT data sets combined with self-designed computer-aided design (CAD) implants, and to use convolutional neural networks (CNNs) and vision transformers (ViTs) for metal segmentation. The models are tested on accurately labeled X-ray test datasets obtained from specimen scans. CNN encoder-based networks such as U-Net show limited performance on cadaver test data, with an average Dice score below 0.30, while the metal segmentation transformer with dual decoder (MST-DD) shows high robustness and generalization on the segmentation task, with an average Dice score of 0.90. Our study indicates that CAD model-based data generation is highly flexible and could be a way to overcome the shortage of clinical data sampling and labelling. Furthermore, the MST-DD approach yields a more reliable neural network when training on simulated data.
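
The abstract reports results as average Dice scores but does not spell out how the metric is computed. As a rough illustration only, the sketch below uses the standard definition for two binary masks, 2|A∩B| / (|A| + |B|). The function name `dice_score`, the toy masks, and the convention for two empty masks are assumptions made for this example and are not taken from the paper.

```python
from itertools import chain
from typing import Sequence


def dice_score(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks of equal size."""
    # Flatten both 2D masks into flat lists of pixels.
    p = list(chain.from_iterable(pred))
    t = list(chain.from_iterable(target))
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as metal in both masks.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    # Total number of metal pixels across the two masks.
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 3x4 masks, made up for illustration (1 = metal pixel).
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

print(dice_score(predicted, reference))
```

Here the predicted mask misses one of the five reference metal pixels, so the score is 8/9. A score of 1.0 means perfect overlap and 0.0 means none, which puts the reported gap between 0.30 and 0.90 in context.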
