Diagnosing and Fixing Manifold Overfitting in Deep Generative Models

Paper title

Diagnosing and Fixing Manifold Overfitting in Deep Generative Models

Authors

Gabriel Loaiza-Ganem, Brendan Leigh Ross, Jesse C. Cresswell, Anthony L. Caterini

Abstract

Likelihood-based, or explicit, deep generative models use neural networks to construct flexible high-dimensional densities. This formulation directly contradicts the manifold hypothesis, which states that observed data lies on a low-dimensional manifold embedded in high-dimensional ambient space. In this paper we investigate the pathologies of maximum-likelihood training in the presence of this dimensionality mismatch. We formally prove that degenerate optima are achieved wherein the manifold itself is learned but not the distribution on it, a phenomenon we call manifold overfitting. We propose a class of two-step procedures consisting of a dimensionality reduction step followed by maximum-likelihood density estimation, and prove that they recover the data-generating distribution in the nonparametric regime, thus avoiding manifold overfitting. We also show that these procedures enable density estimation on the manifolds learned by implicit models, such as generative adversarial networks, hence addressing a major shortcoming of these models. Several recently proposed methods are instances of our two-step procedures; we thus unify, extend, and theoretically justify a large class of models.
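The two-step idea in the abstract (reduce dimensionality first, then fit a maximum-likelihood density in the low-dimensional space) can be illustrated with a toy sketch. The abstract does not name specific models, so everything here is an assumption made for illustration: PCA stands in for the dimensionality reduction step, a Gaussian stands in for the density estimator, and `fit_two_step` and `sample` are hypothetical helper names. The data lies on a line in 2-D, so a full-dimensional Gaussian fit has a singular covariance, which hints at the dimensionality mismatch the paper studies.

```python
import numpy as np


def fit_two_step(x, latent_dim):
    """Step 1: PCA (a linear stand-in for dimensionality reduction).
    Step 2: maximum-likelihood Gaussian fit on the low-dimensional codes."""
    mean = x.mean(axis=0)
    # Principal directions come from the SVD of the centered data.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    components = vt[:latent_dim]            # shape: (latent_dim, ambient_dim)
    codes = (x - mean) @ components.T       # low-dimensional representation
    # bias=True divides by N, which is the maximum-likelihood covariance.
    mu = codes.mean(axis=0)
    cov = np.atleast_2d(np.cov(codes, rowvar=False, bias=True))
    return {"mean": mean, "components": components, "mu": mu, "cov": cov}


def sample(model, n, rng):
    """Sample in latent space, then map back to the ambient space."""
    z = rng.multivariate_normal(model["mu"], model["cov"], size=n)
    return z @ model["components"] + model["mean"]


rng = np.random.default_rng(0)

# Toy data: a 1-D manifold (the line y = 2x + 1) embedded in 2-D.
t = rng.normal(loc=1.0, scale=2.0, size=2000)
x = np.stack([t, 2.0 * t + 1.0], axis=1)

# A full-dimensional Gaussian MLE has a singular covariance on this data,
# so its density is degenerate in the ambient space.
ambient_cov = np.cov(x, rowvar=False, bias=True)
ambient_det = np.linalg.det(ambient_cov)

# The two-step fit yields a well-defined 1-D density on the learned line.
model = fit_two_step(x, latent_dim=1)
samples = sample(model, 500, rng)

# Distance of the generated samples from the true line (should be ~0).
residual = np.abs(samples[:, 1] - (2.0 * samples[:, 0] + 1.0)).max()
```

In this sketch the latent Gaussian captures how the data is spread along the line, while the PCA step captures where the line is. The procedures in the paper would replace both stand-ins with more flexible learned models.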
