Self-Distilled Vision Transformer for Domain Generalization

Paper Title

Self-Distilled Vision Transformer for Domain Generalization

Authors

Maryam Sultana, Muzammal Naseer, Muhammad Haris Khan, Salman Khan, Fahad Shahbaz Khan

Abstract

Several domain generalization (DG) methods have been proposed in recent years and show encouraging performance; however, almost all of them build on convolutional neural networks (CNNs). There has been little to no progress in studying the DG performance of vision transformers (ViTs), which are challenging the supremacy of CNNs on standard benchmarks that are often built on the i.i.d. assumption. This renders the real-world deployment of ViTs doubtful. In this paper, we explore ViTs for addressing the DG problem. Like CNNs, ViTs also struggle in out-of-distribution scenarios, and the main culprit is overfitting to the source domains. Inspired by the modular architecture of ViTs, we propose a simple DG approach for ViTs, coined self-distillation for ViTs. It reduces overfitting to the source domains by easing the learning of the input-output mapping problem through curating non-zero entropy supervisory signals for intermediate transformer blocks. Further, it does not introduce any new parameters and can be seamlessly plugged into the modular composition of different ViTs. We empirically demonstrate notable performance gains with different DG baselines and various ViT backbones on five challenging datasets. Moreover, we report favorable performance against recent state-of-the-art DG methods. Our code, along with pre-trained models, is publicly available at: https://github.com/maryam089/SDViT.
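The abstract does not spell out the loss, so the following is only a minimal sketch of how a self-distillation objective of this kind could look. It assumes, purely for illustration, that every transformer block's class token can be passed through the shared classifier to obtain logits, that one intermediate block is sampled at random per step, and that the final block's temperature-softened prediction (a non-zero entropy target, unlike a one-hot label) supervises it through a KL term. The names `self_distillation_loss`, `temperature`, and `weight` are hypothetical and are not taken from the authors' code.

```python
import math
import random


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_total for z in scaled]


def cross_entropy(logits, label):
    """Standard cross-entropy against a one-hot (zero-entropy) label."""
    return -log_softmax(logits)[label]


def kl_divergence(teacher_log_probs, student_log_probs):
    """KL(teacher || student), both given as log-probabilities."""
    return sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(teacher_log_probs, student_log_probs)
    )


def self_distillation_loss(block_logits, label, temperature=3.0, weight=0.5,
                           rng=random):
    """Assumed form: CE on the final block + weighted KL to one random block.

    block_logits: per-block class logits, ordered from first to final block
    (hypothetical interface; needs at least two blocks).
    """
    final_logits = block_logits[-1]
    # Sample one intermediate block (any block except the final one).
    student_logits = block_logits[rng.randrange(len(block_logits) - 1)]
    # The softened final prediction is the non-zero entropy supervisory signal.
    teacher = log_softmax(final_logits, temperature)
    student = log_softmax(student_logits, temperature)
    return cross_entropy(final_logits, label) + weight * kl_divergence(teacher, student)
```

Because the teacher and the student share the same network and classifier in this sketch, the extra term adds no parameters, which is consistent with the claim in the abstract. In a real training loop the same computation would be done on batched tensors in a deep learning framework.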
