Domain-aware Visual Bias Eliminating for Generalized Zero-Shot Learning

Paper Title

Domain-aware Visual Bias Eliminating for Generalized Zero-Shot Learning

Authors

Shaobo Min, Hantao Yao, Hongtao Xie, Chaoqun Wang, Zheng-Jun Zha, Yongdong Zhang

Abstract

Recent methods focus on learning a unified semantic-aligned visual representation to transfer knowledge between two domains, while ignoring the effect of semantic-free visual representation in alleviating the biased recognition problem. In this paper, we propose a novel Domain-aware Visual Bias Eliminating (DVBE) network that constructs two complementary visual representations, i.e., semantic-free and semantic-aligned, to treat seen and unseen domains separately. Specifically, we explore cross-attentive second-order visual statistics to compact the semantic-free representation, and design an adaptive margin Softmax to maximize inter-class divergences. Thus, the semantic-free representation becomes discriminative enough not only to predict seen classes accurately but also to filter out unseen images, i.e., domain detection, based on the predicted class entropy. For unseen images, we automatically search for an optimal semantic-visual alignment architecture, rather than relying on manual designs, to predict unseen classes. With accurate domain detection, the biased recognition problem towards the seen domain is significantly reduced. Experiments on five benchmarks for classification and segmentation show that DVBE outperforms existing methods by an average of 5.7%.
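The entropy-based domain detection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold value of 0.5, and the plain argmax over unseen-class scores are assumptions made for illustration only. The idea shown is that a confident (low-entropy) prediction over seen classes is treated as a seen-domain image, while a high-entropy prediction is routed to the unseen-class branch.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def detect_domain(seen_logits, threshold):
    """Label an image 'unseen' when the seen-class prediction is uncertain.

    `threshold` is a hypothetical hyperparameter, not a value from the paper.
    """
    h = entropy(softmax(seen_logits))
    return "unseen" if h > threshold else "seen"


def predict(seen_logits, unseen_scores, threshold=0.5):
    """Route the image to the seen or unseen branch, then take the argmax.

    `seen_logits` stands in for the semantic-free branch output and
    `unseen_scores` for the semantic-aligned branch output (both assumed).
    """
    if detect_domain(seen_logits, threshold) == "seen":
        scores, domain = seen_logits, "seen"
    else:
        scores, domain = unseen_scores, "unseen"
    best = max(range(len(scores)), key=lambda i: scores[i])
    return domain, best
```

With a peaked seen-class prediction such as `[10.0, 0.0, 0.0]` the image stays in the seen domain, whereas a flat prediction such as `[1.0, 1.0, 1.0]` has entropy ln 3 and is passed to the unseen branch. In practice the threshold would need to be tuned on held-out data.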
