Paper Title
ISDA: Position-Aware Instance Segmentation with Deformable Attention
Paper Authors
Paper Abstract
Most instance segmentation models are not end-to-end trainable, since they rely on either proposal estimation (RPN) as a pre-processing step or non-maximum suppression (NMS) as a post-processing step. Here we propose a novel end-to-end instance segmentation method termed ISDA. It reshapes the task as predicting a set of object masks, which are generated via a conventional convolution operation using learned position-aware kernels and object features. These kernels and features are learned by leveraging a deformable attention network with multi-scale representations. Thanks to the introduced set-prediction mechanism, the proposed method is NMS-free. Empirically, ISDA outperforms Mask R-CNN (a strong baseline) by 2.6 points on MS-COCO, and achieves leading performance compared with recent models. Code will be available soon.
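The abstract describes generating each object's mask by convolving a shared feature map with a per-object learned kernel. Below is a minimal sketch of that dynamic-convolution idea, assuming 1×1 kernels for simplicity; all names and shapes are illustrative and not taken from the paper's actual implementation.

```python
import numpy as np

def generate_masks(kernels, features):
    """Produce one mask per object via dynamic 1x1 convolution.

    kernels:  (N, C) array -- one learned position-aware kernel per object
    features: (C, H, W) array -- shared object feature map
    returns:  (N, H, W) array -- per-object mask probabilities in [0, 1]
    """
    C, H, W = features.shape
    # A 1x1 convolution over the feature map is a matrix product
    # between the kernels and the flattened spatial features.
    logits = kernels @ features.reshape(C, H * W)   # (N, H*W)
    masks = 1.0 / (1.0 + np.exp(-logits))           # sigmoid
    return masks.reshape(-1, H, W)

# Illustrative shapes: 5 objects, 64 channels, a 32x32 feature map
rng = np.random.default_rng(0)
kernels = rng.normal(size=(5, 64))
features = rng.normal(size=(64, 32, 32))
masks = generate_masks(kernels, features)
print(masks.shape)  # (5, 32, 32)
```

In the paper, both the kernels and the features would come from the deformable attention network rather than being random, and the set-prediction loss matches the N predicted masks to ground-truth objects without NMS.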