Paper title
Bayesian Control Variates for optimal covariance estimation with pairs of simulations and surrogates
Paper authors
Paper abstract
Predictions of the mean and covariance matrix of summary statistics are critical for confronting cosmological theories with observations, not least for likelihood approximations and parameter inference. The price to pay for accurate estimates is the extreme cost of running $N$-body and hydrodynamics simulations. Approximate solvers, or surrogates, greatly reduce the computational cost but can introduce significant biases, for example in the non-linear regime of cosmic structure growth. We propose "CARPool Bayes", an approach to solve the inference problem for both the means and covariances using a combination of simulations and surrogates. Our framework allows incorporating prior information for the mean and covariance. We derive closed-form solutions for Maximum A Posteriori covariance estimates that are efficient Bayesian shrinkage estimators, guarantee positive semi-definiteness, and can optionally leverage analytical covariance approximations. We discuss choices of the prior and propose a simple procedure for obtaining optimal prior hyperparameter values with a small set of test simulations. We test our method by estimating the covariances of clustering statistics of GADGET-III $N$-body simulations at redshift $z=0.5$ using surrogates from a 100-1000$\times$ faster particle-mesh code. Taking the sample covariance from 15,000 simulations as the truth, and using an empirical Bayes prior with diagonal blocks, our estimator produces nearly identical Fisher matrix contours for $Λ$CDM parameters using only $15$ simulations of the non-linear dark matter power spectrum. In this case the number of simulations is so small that the sample covariance would be degenerate. We show cases where even with a naïve prior our method still improves the estimate. Our framework is applicable to a wide range of cosmological and astrophysical problems where fast surrogates are available.
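The control-variate idea named in the title can be illustrated with a minimal sketch. This is the classical scalar control-variate estimator of a mean, not the Bayesian estimator derived in the paper, and it does not address covariances. The toy model, the functions `toy_simulation` and `toy_surrogate`, and all numerical values are hypothetical and chosen only for illustration; the surrogate mean is assumed to be estimated from many extra cheap surrogate runs.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def control_variate_mean(sims, paired_surrogates, surrogate_mean):
    """Classical scalar control-variate estimate of E[simulation].

    sims and paired_surrogates must come from the same random seeds
    (pairs), so that their fluctuations are correlated.
    """
    y_bar = mean(sims)
    c_bar = mean(paired_surrogates)
    cov_yc = sum((y - y_bar) * (c - c_bar)
                 for y, c in zip(sims, paired_surrogates))
    var_c = sum((c - c_bar) ** 2 for c in paired_surrogates)
    beta = cov_yc / var_c  # regression coefficient estimated from the pairs
    # Correct the simulation mean by the surrogate's sampling fluctuation.
    return y_bar - beta * (c_bar - surrogate_mean)


# Hypothetical toy model: a shared "initial condition" ic drives both outputs.
def toy_simulation(ic, rng):
    return 10.0 + 2.0 * ic + rng.gauss(0.0, 0.3)  # expensive, unbiased


def toy_surrogate(ic, rng):
    return 8.0 + 1.8 * ic + rng.gauss(0.0, 0.3)   # cheap, biased


def run_experiment(n_pairs=15, n_extra=1000, n_trials=200, seed=0):
    """Return mean squared errors (plain mean, control-variate mean)."""
    rng = random.Random(seed)
    truth = 10.0  # true simulation mean in the toy model
    plain, cv = [], []
    for _ in range(n_trials):
        ics = [rng.gauss(0.0, 1.0) for _ in range(n_pairs)]
        sims = [toy_simulation(ic, rng) for ic in ics]
        surs = [toy_surrogate(ic, rng) for ic in ics]
        # Many extra cheap surrogate runs pin down the surrogate mean.
        extra = [toy_surrogate(rng.gauss(0.0, 1.0), rng)
                 for _ in range(n_extra)]
        plain.append(mean(sims))
        cv.append(control_variate_mean(sims, surs, mean(extra)))
    mse_plain = mean([(e - truth) ** 2 for e in plain])
    mse_cv = mean([(e - truth) ** 2 for e in cv])
    return mse_plain, mse_cv


if __name__ == "__main__":
    mse_plain, mse_cv = run_experiment()
    print("MSE of plain sample mean:     ", mse_plain)
    print("MSE of control-variate mean:  ", mse_cv)
```

In this toy setting the surrogate is biased (its mean is 8 rather than 10), yet the bias does not enter the estimate: only the correlated fluctuation of the surrogate around its own mean is used, which is why a fast but inaccurate solver can still reduce the variance of an estimate built from a handful of expensive simulations.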