Paper Title

Precision Aggregated Local Models

Paper Authors

Edwards, Adam M., Gramacy, Robert B.

Paper Abstract

Large scale Gaussian process (GP) regression is infeasible for larger data sets due to cubic scaling of flops and quadratic storage involved in working with covariance matrices. Remedies in recent literature focus on divide-and-conquer, e.g., partitioning into sub-problems and inducing functional (and thus computational) independence. Such approximations can be speedy, accurate, and sometimes even more flexible than an ordinary GP. However, a big downside is loss of continuity at partition boundaries. Modern methods like local approximate GPs (LAGPs) imply effectively infinite partitioning and are thus pathologically good and bad in this regard. Model averaging, an alternative to divide-and-conquer, can maintain absolute continuity but often over-smooths, diminishing accuracy. Here we propose putting LAGP-like methods into a local experts-like framework, blending partition-based speed with model-averaging continuity, as a flagship example of what we call precision aggregated local models (PALM). Using $K$ LAGPs, each selecting $n$ from $N$ total data pairs, we illustrate a scheme that is at most cubic in $n$, quadratic in $K$, and linear in $N$, drastically reducing computational and storage demands. Extensive empirical illustration shows how PALM is at least as accurate as LAGP, can be much faster, and furnishes continuous predictive surfaces. Finally, we propose a sequential updating scheme which greedily refines a PALM predictor up to a computational budget.
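
The abstract describes combining $K$ local GP predictors via precision-based aggregation. As a rough illustration only (not the paper's exact PALM estimator), the sketch below shows one way Gaussian predictive means and variances from local models could be blended by weighting each by its predictive precision; the function name `palm_blend` and the optional localization weights are assumptions introduced here for the example.

```python
import numpy as np

def palm_blend(means, variances, weights=None):
    """Illustrative precision-weighted blend of K local predictors at M sites.

    means, variances : (K, M) arrays of each local GP's predictive mean
        and (positive) predictive variance at M test locations.
    weights : optional (K, M) non-negative localization weights; equal
        weighting is used when omitted.
    Returns blended mean and variance, each of shape (M,).
    """
    prec = 1.0 / variances                 # per-model predictive precision
    if weights is not None:
        prec = prec * weights              # optionally down-weight distant local models
    total_prec = prec.sum(axis=0)          # aggregated precision at each test site
    mean = (prec * means).sum(axis=0) / total_prec
    var = 1.0 / total_prec
    return mean, var

# Toy usage: K = 3 hypothetical local models, M = 4 prediction sites.
rng = np.random.default_rng(0)
mu = rng.normal(size=(3, 4))
s2 = rng.uniform(0.1, 1.0, size=(3, 4))
m, v = palm_blend(mu, s2)
```

Because each local model conditions on only $n \ll N$ points, the dominant per-model cost stays cubic in $n$, while aggregation across the $K$ models adds the lower-order terms noted in the abstract.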
