Paper Title

From Real to Synthetic and Back: Synthesizing Training Data for Multi-Person Scene Understanding

Paper Authors

Igor Kviatkovsky, Nadav Bhonker, Gerard Medioni

Paper Abstract

We present a method for synthesizing natural-looking images of multiple people interacting in a specific scenario. These images benefit from the advantages of synthetic data: they are fully controllable and fully annotated with any type of standard or custom-defined ground truth. To reduce the synthetic-to-real domain gap, we introduce a pipeline consisting of the following steps: 1) we render scenes in a context modeled after the real world, 2) we train a human parsing model on the synthetic images, 3) we use the model to estimate segmentation maps for real images, 4) we train a conditional generative adversarial network (cGAN) to learn the inverse mapping -- from a segmentation map to a real image, and 5) given new synthetic segmentation maps, we use the cGAN to generate realistic images. An illustration of our pipeline is presented in Figure 2. We use the generated data to train a multi-task model on the challenging tasks of UV mapping and dense depth estimation. We demonstrate the value of the generated data and the trained model, both quantitatively and qualitatively, on the CMU Panoptic Dataset.
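
Steps 4 and 5 of the pipeline hinge on a conditional GAN that learns the inverse mapping from a segmentation map to a real image. The abstract does not specify the architecture, so the following is only a minimal pix2pix-style sketch in PyTorch: the `Generator`, `Discriminator`, and `train_step` names, the layer sizes, the `num_classes=25` channel count, and the L1 weight `lam=100.0` are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of pipeline steps 4-5: train a conditional GAN to map a
# segmentation map to a realistic image. Pix2pix-style; all module names,
# layer sizes, and num_classes below are illustrative assumptions.
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Toy encoder-decoder: one-hot segmentation map -> RGB image."""

    def __init__(self, num_classes=25):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, seg):
        return self.net(seg)


class Discriminator(nn.Module):
    """Toy PatchGAN-style critic scoring (segmentation, image) pairs."""

    def __init__(self, num_classes=25):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes + 3, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),  # per-patch real/fake logits
        )

    def forward(self, seg, img):
        return self.net(torch.cat([seg, img], dim=1))


def train_step(G, D, opt_g, opt_d, seg, real, adv, l1, lam=100.0):
    # Discriminator: distinguish (seg, real) pairs from (seg, G(seg)) pairs.
    fake = G(seg)
    d_real, d_fake = D(seg, real), D(seg, fake.detach())
    loss_d = adv(d_real, torch.ones_like(d_real)) + adv(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: fool D, plus an L1 reconstruction term (weighting as in pix2pix).
    d_fake = D(seg, fake)
    loss_g = adv(d_fake, torch.ones_like(d_fake)) + lam * l1(fake, real)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()


if __name__ == "__main__":
    G, D = Generator(), Discriminator()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
    seg = torch.rand(2, 25, 64, 64)   # placeholder segmentation maps
    real = torch.rand(2, 3, 64, 64)   # placeholder real images
    print(train_step(G, D, opt_g, opt_d, seg, real,
                     nn.BCEWithLogitsLoss(), nn.L1Loss()))
```

Once such a cGAN is trained on real images paired with their estimated segmentation maps (step 4), step 5 amounts to feeding it segmentation maps rendered from the fully annotated synthetic scenes, yielding realistic images whose ground truth is inherited from the synthetic source.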
