Causal Discovery in Linear Latent Variable Models Subject to Measurement Error

Paper Title

Causal Discovery in Linear Latent Variable Models Subject to Measurement Error

Authors

Yuqin Yang, AmirEmad Ghassami, Mohamed Nafea, Negar Kiyavash, Kun Zhang, Ilya Shpitser

Abstract

We focus on causal discovery in the presence of measurement error in linear systems where the mixing matrix, i.e., the matrix indicating the independent exogenous noise terms pertaining to the observed variables, is identified up to permutation and scaling of the columns. We demonstrate a somewhat surprising connection between this problem and causal discovery in the presence of unobserved parentless causes, in the sense that there is a mapping, given by the mixing matrix, between the underlying models to be inferred in these problems. Consequently, any identifiability result based on the mixing matrix for one model translates to an identifiability result for the other model. We characterize to what extent the causal models can be identified under a two-part faithfulness assumption. Under only the first part of the assumption (corresponding to the conventional definition of faithfulness), the structure can be learned up to the causal ordering among an ordered grouping of the variables but not all the edges across the groups can be identified. We further show that if both parts of the faithfulness assumption are imposed, the structure can be learned up to a more refined ordered grouping. As a result of this refinement, for the latent variable model with unobserved parentless causes, the structure can be identified. Based on our theoretical results, we propose causal structure learning methods for both models, and evaluate their performance on synthetic data.
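To make the notion of a mixing matrix concrete, the following is a minimal sketch and not the authors' method. It assumes a hypothetical three-variable chain with latent variables Z = A Z + N_Z and noisy measurements X = Z + N_X; the coefficient matrix `A`, the Laplace noise, and all variable names are illustrative assumptions that do not come from the paper. Under these assumptions, X is a linear mixture of the stacked independent noise terms, and the mixing matrix is [(I - A)^{-1} | I].

```python
import numpy as np

# Hypothetical chain Z1 -> Z2 -> Z3 (illustrative values, not from the paper).
# A[i, j] is the direct effect of Z_j on Z_i.
A = np.array([[0.0,  0.0, 0.0],
              [0.8,  0.0, 0.0],
              [0.0, -0.5, 0.0]])
p = A.shape[0]

# Latent part: Z = A Z + N_Z  implies  Z = (I - A)^{-1} N_Z.
W_latent = np.linalg.inv(np.eye(p) - A)

# Measurement part: X = Z + N_X  implies  X = [W_latent | I] [N_Z; N_X].
W = np.hstack([W_latent, np.eye(p)])

# Independent non-Gaussian sources (assumed Laplace for illustration).
rng = np.random.default_rng(0)
noise = rng.laplace(size=(2 * p, 1000))
X = W @ noise

# Cross-check by generating the data step by step in causal order.
N_Z, N_X = noise[:p], noise[p:]
Z = np.zeros_like(N_Z)
for i in range(p):  # valid because A is strictly lower triangular
    Z[i] = A[i] @ Z + N_Z[i]
X_direct = Z + N_X

# Identification "up to permutation and scaling of the columns" means that
# a matrix like W_equiv cannot be told apart from W using X alone.
perm = rng.permutation(2 * p)
scales = rng.uniform(0.5, 2.0, size=2 * p)
W_equiv = W[:, perm] * scales
X_equiv = W_equiv @ (noise[perm] / scales[:, None])
```

Both `X_direct` and `X_equiv` coincide with `X` up to floating-point error, which illustrates why only the column-permuted, column-scaled class of `W` can be recovered from the observed data.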
