Paper Title

Data Augmentation with norm-VAE for Unsupervised Domain Adaptation

Paper Authors

Qian Wang, Fanlin Meng, Toby P. Breckon

Paper Abstract

We address the Unsupervised Domain Adaptation (UDA) problem in image classification from a new perspective. In contrast to most existing works which either align the data distributions or learn domain-invariant features, we directly learn a unified classifier for both domains within a high-dimensional homogeneous feature space without explicit domain adaptation. To this end, we employ the effective Selective Pseudo-Labelling (SPL) techniques to take advantage of the unlabelled samples in the target domain. Surprisingly, data distribution discrepancy across the source and target domains can be well handled by a computationally simple classifier (e.g., a shallow Multi-Layer Perceptron) trained in the original feature space. Besides, we propose a novel generative model norm-VAE to generate synthetic features for the target domain as a data augmentation strategy to enhance classifier training. Experimental results on several benchmark datasets demonstrate the pseudo-labelling strategy itself can lead to comparable performance to many state-of-the-art methods whilst the use of norm-VAE for feature augmentation can further improve the performance in most cases. As a result, our proposed methods (i.e. naive-SPL and norm-VAE-SPL) can achieve new state-of-the-art performance with the average accuracy of 93.4% and 90.4% on Office-Caltech and ImageCLEF-DA datasets, and comparable performance on Digits, Office31 and Office-Home datasets with the average accuracy of 97.2%, 87.6% and 67.9% respectively.
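The abstract describes two components: selective pseudo-labelling of confident target-domain samples, and a VAE that synthesises target-domain features to augment classifier training. As a rough, non-authoritative illustration only, the sketch below implements a plain conditional VAE over pre-extracted feature vectors together with a confidence-threshold pseudo-labelling step; the actual norm-VAE architecture and the SPL selection rule in the paper differ, and every name and dimension here (FeatureCVAE, select_pseudo_labels, the 2048-d features, 31 classes) is an assumption for illustration, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the paper's norm-VAE): a plain conditional
# VAE over pre-extracted deep features, used to synthesise extra target-domain
# features, plus a confidence-based selective pseudo-labelling step.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureCVAE(nn.Module):
    """Conditional VAE over feature vectors, conditioned on a class label."""

    def __init__(self, feat_dim=2048, num_classes=31, latent_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_classes, 64)
        self.enc = nn.Sequential(nn.Linear(feat_dim + 64, 512), nn.ReLU())
        self.mu = nn.Linear(512, latent_dim)
        self.logvar = nn.Linear(512, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + 64, 512), nn.ReLU(), nn.Linear(512, feat_dim)
        )

    def forward(self, x, y):
        c = self.embed(y)
        h = self.enc(torch.cat([x, c], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterise
        return self.dec(torch.cat([z, c], dim=1)), mu, logvar

    def generate(self, y):
        """Sample synthetic features for the given (pseudo-)labels."""
        z = torch.randn(y.size(0), self.mu.out_features, device=y.device)
        return self.dec(torch.cat([z, self.embed(y)], dim=1))


def vae_loss(recon, x, mu, logvar):
    # Standard VAE objective: reconstruction error + KL divergence.
    rec = F.mse_loss(recon, x, reduction="mean")
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld


def select_pseudo_labels(classifier, target_feats, threshold=0.9):
    """Keep only target samples whose predicted class probability is high."""
    with torch.no_grad():
        probs = F.softmax(classifier(target_feats), dim=1)
        conf, pseudo = probs.max(dim=1)
    keep = conf >= threshold
    return target_feats[keep], pseudo[keep]


if __name__ == "__main__":
    feats = torch.randn(64, 2048)            # stand-in for extracted deep features
    labels = torch.randint(0, 31, (64,))
    model = FeatureCVAE()
    recon, mu, logvar = model(feats, labels)
    print(vae_loss(recon, feats, mu, logvar).item())
    print(model.generate(labels).shape)      # synthetic features: (64, 2048)

    # A shallow MLP classifier as in the abstract; low threshold only because
    # this demo classifier is untrained.
    mlp = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 31))
    kept_feats, kept_labels = select_pseudo_labels(mlp, feats, threshold=0.05)
    print(kept_feats.shape, kept_labels.shape)
```

In the pipeline the abstract outlines, such synthetic target features would be pooled with the labelled source features and the confidently pseudo-labelled target features to train the shallow MLP classifier.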
