Experimental design for causal query estimation in partially observed biomolecular networks

Paper title

Experimental design for causal query estimation in partially observed biomolecular networks

Authors

Sara Mohammad-Taheri, Vartika Tewari, Rohan Kapre, Ehsan Rahiminasab, Karen Sachs, Charles Tapley Hoyt, Jeremy Zucker, Olga Vitek

Abstract

Estimating a causal query from observational data is an essential task in the analysis of biomolecular networks. Estimation takes as input a network topology, a query estimation method, and observational measurements on the network variables. However, estimations involving many variables can be experimentally expensive, and computationally intractable. Moreover, using the full set of variables can be detrimental, leading to bias, or increasing the variance in the estimation. Therefore, designing an experiment based on a well-chosen subset of network components can increase estimation accuracy, and reduce experimental and computational costs. We propose a simulation-based algorithm for selecting sub-networks that support unbiased estimators of the causal query under a constraint of cost, ranked with respect to the variance of the estimators. The simulations are constructed based on historical experimental data, or based on known properties of the biological system. Three case studies demonstrated the effectiveness of well-chosen network subsets for estimating causal queries from observational data. All the case studies are reproducible and available at https://github.com/srtaheri/Simplified_LVM.
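The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or code: the toy linear model, the variable names (`I`, `W`, `Z`), the coefficients, the unit costs, and the bias tolerance are all hypothetical, and ordinary least squares with covariate adjustment stands in for whatever query estimation method is actually used. The sketch simulates repeated datasets, keeps the affordable variable subsets whose estimator of the effect of X on Y is approximately unbiased, and ranks them by estimator variance.

```python
import itertools
import random
import statistics

TRUE_EFFECT = 2.0                        # hypothetical causal effect of X on Y
COST = {"I": 1.0, "W": 1.0, "Z": 1.0}    # hypothetical per-variable measurement costs


def simulate(n, rng):
    """Draw n samples from a toy linear model: I -> X <- Z -> Y <- W, and X -> Y."""
    rows = []
    for _ in range(n):
        z, w, i = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        x = z + i + rng.gauss(0, 1)
        y = TRUE_EFFECT * x + 1.5 * z + 2.0 * w + rng.gauss(0, 1)
        rows.append({"X": x, "Y": y, "Z": z, "W": w, "I": i})
    return rows


def ols_coef_on_x(rows, covariates):
    """Coefficient of X when regressing Y on an intercept, X, and the covariates."""
    names = ["X", *covariates]
    design = [[1.0] + [r[v] for v in names] for r in rows]
    p = len(design[0])
    # Augmented normal equations [D'D | D'y].
    a = [
        [sum(d[j] * d[k] for d in design) for k in range(p)]
        + [sum(d[j] * r["Y"] for d, r in zip(design, rows))]
        for j in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(p):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [u - f * v for u, v in zip(a[row], a[col])]
    return a[1][p] / a[1][1]             # index 1 is the X column


def rank_subsets(budget, n=100, reps=200, tol=0.1, seed=0):
    """Return (variance, subset, cost) for affordable, near-unbiased subsets, best first."""
    rng = random.Random(seed)
    datasets = [simulate(n, rng) for _ in range(reps)]
    results = []
    for k in range(len(COST) + 1):
        for subset in itertools.combinations(sorted(COST), k):
            cost = sum(COST[v] for v in subset)
            if cost > budget:
                continue                 # violates the cost constraint
            estimates = [ols_coef_on_x(d, subset) for d in datasets]
            bias = statistics.fmean(estimates) - TRUE_EFFECT
            if abs(bias) <= tol:         # keep only approximately unbiased subsets
                results.append((statistics.variance(estimates), subset, cost))
    return sorted(results)


if __name__ == "__main__":
    for variance, subset, cost in rank_subsets(budget=2.0):
        print(subset, f"cost={cost:.1f}", f"variance={variance:.4f}")
```

In this toy model, every subset that omits the confounder `Z` is rejected as biased. Among the remaining subsets, adding `W` (a second cause of Y) lowers the variance, while adding `I` (which affects only X) raises it, so measuring more variables is not automatically better.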

