DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation

Paper title

DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation

Authors

Tang, Feilong, Huang, Qiming, Wang, Jinfeng, Hou, Xianxu, Su, Jionglong, Liu, Jingxin

Abstract

Transformer-based models have been widely demonstrated to be successful in computer vision tasks by modelling long-range dependencies and capturing global representations. However, they are often dominated by features of large patterns leading to the loss of local details (e.g., boundaries and small objects), which are critical in medical image segmentation. To alleviate this problem, we propose a Dual-Aggregation Transformer Network called DuAT, which is characterized by two innovative designs, namely, the Global-to-Local Spatial Aggregation (GLSA) and Selective Boundary Aggregation (SBA) modules. The GLSA has the ability to aggregate and represent both global and local spatial features, which are beneficial for locating large and small objects, respectively. The SBA module is used to aggregate the boundary characteristic from low-level features and semantic information from high-level features for better preserving boundary details and locating the re-calibration objects. Extensive experiments in six benchmark datasets demonstrate that our proposed model outperforms state-of-the-art methods in the segmentation of skin lesion images, and polyps in colonoscopy images. In addition, our approach is more robust than existing methods in various challenging situations such as small object segmentation and ambiguous object boundaries.
