Paper Title
Reference-Invariant Inverse Covariance Estimation with Application to Microbial Network Recovery
Paper Authors
Paper Abstract
Interactions between microbial taxa in microbiome data have attracted considerable research interest in the scientific community. In particular, several methods, such as SPIEC-EASI, gCoda, and CD-trace, have been proposed to model the conditional dependency between microbial taxa in order to avoid detecting spurious correlations. However, all of these methods are built upon the centered log-ratio (CLR) transformation, which results in a degenerate covariance matrix and thus an undefined inverse covariance matrix as the estimate of the underlying network. Jiang et al. (2021) and Tian et al. (2022) proposed the bias-corrected graphical lasso and the compositional graphical lasso based on the additive log-ratio (ALR) transformation, which first selects a reference taxon and then computes the log ratios of the abundances of all other taxa with respect to that of the reference. One concern with the ALR transformation is whether the estimated network is invariant to the choice of reference. In this paper, we first establish the reference-invariance property of a subnetwork of interest based on the ALR-transformed data. Then, we propose a reference-invariant version of the compositional graphical lasso by modifying the penalty in its objective function so that only the invariant subnetwork is penalized. We validate the reference-invariance property of the proposed method under a variety of simulation scenarios as well as through an application to an oceanic microbiome data set.
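The contrast between the two transformations can be illustrated with a minimal NumPy sketch. It is not the authors' estimator: it uses synthetic compositions and the plain inverse of the sample covariance, with no lasso penalty. The helper names (`clr`, `alr`, `sample_precision`) and the simulated data are assumptions made for illustration. The sketch also assumes that the subnetwork of interest is the block of the inverse covariance among taxa that are not used as a reference under either choice. For the unpenalized inverse, that block is unchanged because switching the reference is an invertible linear map of the ALR coordinates.

```python
import numpy as np


def clr(x):
    """Centered log-ratio: log abundances minus their per-sample mean."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)


def alr(x, ref):
    """Additive log-ratio: log of each non-reference taxon over the reference.

    Returns the transformed data and the original indices of the kept taxa.
    """
    logx = np.log(x)
    kept = [j for j in range(x.shape[1]) if j != ref]
    return logx[:, kept] - logx[:, [ref]], kept


def sample_precision(z):
    """Inverse of the sample covariance (no penalty; illustration only)."""
    return np.linalg.inv(np.cov(z, rowvar=False))


# Synthetic compositions (hypothetical data): exponentiated correlated
# Gaussians, normalized so that each row sums to one.
rng = np.random.default_rng(0)
n, k = 200, 6
mix = np.eye(k) + 0.3 * np.triu(np.ones((k, k)), 1)
x = np.exp(rng.normal(size=(n, k)) @ mix)
x /= x.sum(axis=1, keepdims=True)

# CLR rows sum to zero, so the CLR covariance has a zero eigenvalue
# (up to floating-point error) and cannot be inverted.
clr_min_eig = np.linalg.eigvalsh(np.cov(clr(x), rowvar=False)).min()

# ALR with two different reference taxa gives invertible covariances.
z_a, kept_a = alr(x, ref=k - 1)
z_b, kept_b = alr(x, ref=k - 2)
alr_min_eig = np.linalg.eigvalsh(np.cov(z_a, rowvar=False)).min()
omega_a = sample_precision(z_a)
omega_b = sample_precision(z_b)

# Compare the block among taxa that are not a reference in either case.
shared = [j for j in range(k) if j not in (k - 1, k - 2)]
idx_a = [kept_a.index(j) for j in shared]
idx_b = [kept_b.index(j) for j in shared]
sub_a = omega_a[np.ix_(idx_a, idx_a)]
sub_b = omega_b[np.ix_(idx_b, idx_b)]
max_gap = np.abs(sub_a - sub_b).max()

print("smallest CLR covariance eigenvalue:", clr_min_eig)
print("smallest ALR covariance eigenvalue:", alr_min_eig)
print("max difference on the shared block:", max_gap)
```

In this sketch the shared block agrees across the two references up to floating-point error, while the rows and columns that involve either reference taxon generally differ. A penalty applied to every entry would depend on those reference-specific entries, which is the motivation the abstract gives for penalizing only the invariant subnetwork.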