Paper Title

Data Augmentation with norm-VAE for Unsupervised Domain Adaptation

Paper Authors

Qian Wang, Fanlin Meng, Toby P. Breckon

Paper Abstract

We address the Unsupervised Domain Adaptation (UDA) problem in image classification from a new perspective. In contrast to most existing works which either align the data distributions or learn domain-invariant features, we directly learn a unified classifier for both domains within a high-dimensional homogeneous feature space without explicit domain adaptation. To this end, we employ the effective Selective Pseudo-Labelling (SPL) techniques to take advantage of the unlabelled samples in the target domain. Surprisingly, data distribution discrepancy across the source and target domains can be well handled by a computationally simple classifier (e.g., a shallow Multi-Layer Perceptron) trained in the original feature space. Besides, we propose a novel generative model norm-VAE to generate synthetic features for the target domain as a data augmentation strategy to enhance classifier training. Experimental results on several benchmark datasets demonstrate the pseudo-labelling strategy itself can lead to comparable performance to many state-of-the-art methods whilst the use of norm-VAE for feature augmentation can further improve the performance in most cases. As a result, our proposed methods (i.e. naive-SPL and norm-VAE-SPL) can achieve new state-of-the-art performance with the average accuracy of 93.4% and 90.4% on Office-Caltech and ImageCLEF-DA datasets, and comparable performance on Digits, Office31 and Office-Home datasets with the average accuracy of 97.2%, 87.6% and 67.9% respectively.
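The abstract describes two components: selective pseudo-labelling of confident target-domain samples, and a VAE that synthesises target-domain features to augment classifier training. As a rough, non-authoritative illustration only, the sketch below implements a plain conditional VAE over pre-extracted feature vectors together with a confidence-threshold pseudo-labelling step; the actual norm-VAE architecture and the SPL selection rule in the paper differ, and every name and dimension here (FeatureCVAE, select_pseudo_labels, the 2048-d features, 31 classes) is an assumption for illustration, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the paper's norm-VAE): a plain conditional
# VAE over pre-extracted deep features, used to synthesise extra target-domain
# features, plus a confidence-based selective pseudo-labelling step.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureCVAE(nn.Module):
    """Conditional VAE over feature vectors, conditioned on a class label."""

    def __init__(self, feat_dim=2048, num_classes=31, latent_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_classes, 64)
        self.enc = nn.Sequential(nn.Linear(feat_dim + 64, 512), nn.ReLU())
        self.mu = nn.Linear(512, latent_dim)
        self.logvar = nn.Linear(512, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + 64, 512), nn.ReLU(), nn.Linear(512, feat_dim)
        )

    def forward(self, x, y):
        c = self.embed(y)
        h = self.enc(torch.cat([x, c], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterise
        return self.dec(torch.cat([z, c], dim=1)), mu, logvar

    def generate(self, y):
        """Sample synthetic features for the given (pseudo-)labels."""
        z = torch.randn(y.size(0), self.mu.out_features, device=y.device)
        return self.dec(torch.cat([z, self.embed(y)], dim=1))


def vae_loss(recon, x, mu, logvar):
    # Standard VAE objective: reconstruction error + KL divergence.
    rec = F.mse_loss(recon, x, reduction="mean")
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld


def select_pseudo_labels(classifier, target_feats, threshold=0.9):
    """Keep only target samples whose predicted class probability is high."""
    with torch.no_grad():
        probs = F.softmax(classifier(target_feats), dim=1)
        conf, pseudo = probs.max(dim=1)
    keep = conf >= threshold
    return target_feats[keep], pseudo[keep]


if __name__ == "__main__":
    feats = torch.randn(64, 2048)            # stand-in for extracted deep features
    labels = torch.randint(0, 31, (64,))
    model = FeatureCVAE()
    recon, mu, logvar = model(feats, labels)
    print(vae_loss(recon, feats, mu, logvar).item())
    print(model.generate(labels).shape)      # synthetic features: (64, 2048)

    # A shallow MLP classifier as in the abstract; low threshold only because
    # this demo classifier is untrained.
    mlp = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 31))
    kept_feats, kept_labels = select_pseudo_labels(mlp, feats, threshold=0.05)
    print(kept_feats.shape, kept_labels.shape)
```

In the pipeline the abstract outlines, such synthetic target features would be pooled with the labelled source features and the confidently pseudo-labelled target features to train the shallow MLP classifier.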
