Paper Title

Masking schemes for universal marginalisers

Paper Authors

Gautam, Divya, Lomeli, Maria, Gourgoulias, Kostis, Thompson, Daniel H., Johri, Saurabh

Abstract

We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser (arXiv:1711.00695) in order to learn conditional distributions of the form $P(x_i |\mathbf x_{\mathbf b})$, where $x_i$ is a given random variable and $\mathbf x_{\mathbf b}$ is some arbitrary subset of all random variables of the generative model of interest. In other words, we mimic the self-supervised training of a denoising autoencoder, where a dataset of unlabelled data is used as partially observed input and the neural approximator is optimised to minimise reconstruction loss. We focus on studying the underlying process of the partially observed data---how good is the neural approximator at learning all conditional distributions when the observation process at prediction time differs from the masking process during training? We compare networks trained with different masking schemes in terms of their predictive performance and generalisation properties.
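The structure-agnostic masking described in the abstract can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the function name `mask_batch`, the per-variable masking rate `p`, and the convention of zero-filling unobserved entries are all assumptions made for the sketch. Each variable is hidden independently with probability `p`, producing the partially observed input on which a neural approximator would be trained to reconstruct the masked variables.

```python
import numpy as np

def mask_batch(x, p=0.5, rng=None):
    """Structure-agnostic masking for denoising-autoencoder-style
    self-supervised training (hypothetical helper, not from the paper).

    Each entry of x is hidden independently with probability p.
    Returns the masked input and a Boolean mask (True = observed).
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    mask = rng.random(x.shape) >= p   # True where the variable stays observed
    x_obs = np.where(mask, x, 0.0)    # unobserved entries are zero-filled
    return x_obs, mask

def reconstruction_loss(x_pred, x_true, mask):
    """Mean squared error on the *unobserved* entries only, mimicking the
    reconstruction objective described in the abstract (sketch assumption:
    a squared-error loss; the paper's actual loss may differ)."""
    hidden = ~mask
    if not hidden.any():
        return 0.0
    return float(np.mean((x_pred[hidden] - x_true[hidden]) ** 2))
```

A structure-dependent scheme would replace the independent Bernoulli draws in `mask_batch` with a masking distribution informed by the generative model's graph, which is the comparison the abstract sets up.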
