PASTA: Proportional Amplitude Spectrum Training Augmentation for Syn-to-Real Domain Generalization

Paper Title

PASTA: Proportional Amplitude Spectrum Training Augmentation for Syn-to-Real Domain Generalization

Authors

Prithvijit Chattopadhyay, Kartik Sarangmath, Vivek Vijaykumar, Judy Hoffman

Abstract

Synthetic data offers the promise of cheap and bountiful training data for settings where labeled real-world data is scarce. However, models trained on synthetic data significantly underperform when evaluated on real-world data. In this paper, we propose Proportional Amplitude Spectrum Training Augmentation (PASTA), a simple and effective augmentation strategy to improve out-of-the-box synthetic-to-real (syn-to-real) generalization performance. PASTA perturbs the amplitude spectra of synthetic images in the Fourier domain to generate augmented views. Specifically, with PASTA we propose a structured perturbation strategy where high-frequency components are perturbed relatively more than the low-frequency ones. For the tasks of semantic segmentation (GTAV-to-Real), object detection (Sim10K-to-Real), and object recognition (VisDA-C Syn-to-Real), across a total of 5 syn-to-real shifts, we find that PASTA outperforms more complex state-of-the-art generalization methods while being complementary to the same.
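
The perturbation described in the abstract can be illustrated with a minimal NumPy sketch. The abstract only states that the amplitude spectrum is perturbed and that high frequencies are perturbed relatively more than low ones; the exact functional form below, the multiplicative Gaussian noise, the function names `perturbation_scale` and `pasta_like_augment`, and the parameters `alpha`, `k`, and `beta` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def perturbation_scale(h, w, alpha=3.0, k=2.0, beta=0.25):
    """Per-frequency noise scale that grows with spatial frequency.

    Assumed form (hypothetical): sigma = (alpha * r) ** k + beta, where r is
    the radial frequency normalized to [0, 1].
    """
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies in [-0.5, 0.5)
    radius = np.sqrt(fx ** 2 + fy ** 2) / np.sqrt(0.5)  # normalize to [0, 1]
    return (alpha * radius) ** k + beta


def pasta_like_augment(image, alpha=3.0, k=2.0, beta=0.25, rng=None):
    """Return an augmented view of `image` (H x W or H x W x C, values in [0, 1])."""
    rng = np.random.default_rng() if rng is None else rng
    img = np.asarray(image, dtype=np.float64)
    squeeze = img.ndim == 2
    if squeeze:
        img = img[..., None]
    h, w, c = img.shape

    # Split each channel into amplitude and phase in the Fourier domain.
    spectrum = np.fft.fft2(img, axes=(0, 1))
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Multiplicative noise centered at 1; its spread is larger at high frequencies.
    sigma = perturbation_scale(h, w, alpha, k, beta)[..., None]
    noise = rng.normal(loc=1.0, scale=sigma, size=(h, w, c))

    # Recombine the perturbed amplitude with the original phase and invert.
    perturbed = amplitude * noise * np.exp(1j * phase)
    out = np.real(np.fft.ifft2(perturbed, axes=(0, 1)))
    out = np.clip(out, 0.0, 1.0)
    return out[..., 0] if squeeze else out
```

Keeping the phase untouched is the design choice that matters here: phase carries most of the spatial layout of an image, so perturbing only the amplitude changes texture-like statistics while leaving object structure largely intact. A call such as `pasta_like_augment(img, rng=np.random.default_rng(0))` would be applied to each synthetic training image before the usual training pipeline.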
