Paper Title
MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
Paper Authors
Paper Abstract
We present MaX-DeepLab, the first end-to-end model for panoptic segmentation. Our approach simplifies the current pipeline that depends heavily on surrogate sub-tasks and hand-designed components, such as box detection, non-maximum suppression, thing-stuff merging, etc. Although these sub-tasks are tackled by area experts, they fail to comprehensively solve the target task. By contrast, our MaX-DeepLab directly predicts class-labeled masks with a mask transformer, and is trained with a panoptic quality inspired loss via bipartite matching. Our mask transformer employs a dual-path architecture that introduces a global memory path in addition to a CNN path, allowing direct communication with any CNN layers. As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time. A small variant of MaX-DeepLab improves 3.0% PQ over DETR with similar parameters and M-Adds. Furthermore, MaX-DeepLab, without test time augmentation, achieves new state-of-the-art 51.3% PQ on COCO test-dev set. Code is available at https://github.com/google-research/deeplab2.
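To make the "PQ-inspired loss via bipartite matching" mentioned in the abstract more concrete, the sketch below pairs predicted class-labeled masks with ground-truth masks by maximizing a similarity that multiplies the predicted class probability with a Dice-style mask overlap, in the spirit of the PQ decomposition into recognition and segmentation quality. This is a minimal illustration, not the released implementation: the tensor shapes, the `dice` helper, and the use of `scipy.optimize.linear_sum_assignment` are assumptions for the example.

```python
# Minimal sketch of PQ-style bipartite matching between predicted and
# ground-truth masks. Not the authors' code; shapes and helpers are assumed.
import numpy as np
from scipy.optimize import linear_sum_assignment


def dice(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-6) -> float:
    """Soft Dice overlap between a predicted mask (values in [0, 1]) and a
    binary ground-truth mask, both flattened to 1-D."""
    inter = (pred_mask * gt_mask).sum()
    return (2.0 * inter + eps) / (pred_mask.sum() + gt_mask.sum() + eps)


def pq_style_matching(pred_masks, pred_class_probs, gt_masks, gt_classes):
    """Match N predicted masks to K ground-truth masks.

    pred_masks:       (N, H*W) soft masks.
    pred_class_probs: (N, C) per-mask class distributions.
    gt_masks:         (K, H*W) binary masks.
    gt_classes:       (K,) ground-truth class indices.
    Returns (pred_idx, gt_idx) pairs that maximize the total similarity
    p(class) * Dice(mask); the matched pairs would then be supervised by
    the PQ-style loss while unmatched predictions are pushed to "no object".
    """
    n, k = len(pred_masks), len(gt_masks)
    sim = np.zeros((n, k))
    for i in range(n):
        for j in range(k):
            sim[i, j] = pred_class_probs[i, gt_classes[j]] * dice(
                pred_masks[i], gt_masks[j])
    # The Hungarian algorithm minimizes cost, so negate the similarity.
    rows, cols = linear_sum_assignment(-sim)
    return list(zip(rows.tolist(), cols.tolist()))
```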