Title
Information Theory Measures via Multidimensional Gaussianization
Authors
Abstract
Information theory is an outstanding framework to measure uncertainty, dependence, and relevance in data and systems. It has several desirable properties for real-world applications: it naturally deals with multivariate data, it can handle heterogeneous data types, and its measures can be interpreted in physical units. However, it has not been adopted by a wider audience because obtaining information from multidimensional data is a challenging problem due to the curse of dimensionality. Here we propose an indirect way of computing information based on a multivariate Gaussianization transform. Our proposal mitigates the difficulty of multivariate density estimation by reducing it to a composition of tractable (marginal) operations and simple linear transformations, which can be interpreted as a particular deep neural network. We introduce specific Gaussianization-based methodologies to estimate total correlation, entropy, mutual information, and Kullback-Leibler divergence. We compare them to recent estimators, showing their accuracy on synthetic data generated from different multivariate distributions. We make the tools and datasets publicly available to provide a test-bed for analyzing future methodologies. Results show that our proposal is superior to previous estimators, particularly in high-dimensional scenarios, and that it leads to interesting insights in neuroscience, geoscience, computer vision, and machine learning.
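To give a flavor of the idea, the sketch below shows a single-layer variant of the Gaussianization approach: each marginal is mapped to a standard normal through its empirical CDF, and the joint of the transformed data is then approximated as Gaussian, for which total correlation has a closed form, TC = -0.5 log det(R). This is a simplified Gaussian-copula illustration, not the paper's full iterated estimator (which composes many marginal-Gaussianization and rotation layers); the function names are ours.

```python
import numpy as np
from scipy import stats

def marginal_gaussianize(x):
    """Map each column to approximately standard normal via a rank transform.

    Ranks give an empirical-CDF estimate; norm.ppf converts the resulting
    uniform values to Gaussian quantiles (the tractable marginal operation).
    """
    n = x.shape[0]
    u = stats.rankdata(x, axis=0) / (n + 1.0)  # uniform values in (0, 1)
    return stats.norm.ppf(u)

def total_correlation_gaussian_copula(x):
    """One-layer Gaussianization estimate of total correlation, in nats.

    After marginal Gaussianization, approximate the joint as Gaussian with
    correlation matrix R, for which TC = -0.5 * log det(R).
    """
    g = marginal_gaussianize(np.asarray(x, dtype=float))
    r = np.corrcoef(g, rowvar=False)
    _, logdet = np.linalg.slogdet(r)
    return -0.5 * logdet

# Example: correlated bivariate Gaussian, where the analytic total
# correlation is -0.5 * log(1 - rho**2).
rng = np.random.default_rng(0)
rho = 0.8
samples = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]],
                                  size=20000)
print(total_correlation_gaussian_copula(samples))  # close to analytic value
print(-0.5 * np.log(1 - rho**2))
```

The single linear step here (a correlation estimate) plays the role of the paper's rotations; iterating marginal Gaussianization and rotations is what lets the full method handle non-Gaussian-copula dependence.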