Paper Title
Trace-class Gaussian priors for Bayesian learning of neural networks with MCMC
Paper Authors
Paper Abstract
This paper introduces a new neural-network-based prior for real-valued functions on $\mathbb R^d$ which, by construction, scales more easily and cheaply in the domain dimension $d$ than the usual Karhunen-Loève function space prior. The new prior is a Gaussian neural network prior, in which each weight and bias has an independent Gaussian prior, but with the key difference that the variances decrease with the width of the network in such a way that the resulting function is \emph{almost surely} well defined in the limit of an infinite-width network. We show that in a Bayesian treatment of inferring unknown functions, the induced posterior over functions is amenable to Monte Carlo sampling using Hilbert space Markov chain Monte Carlo (MCMC) methods. This type of MCMC is popular, e.g., in the Bayesian inverse problems literature, because it is stable under \emph{mesh refinement}, i.e., the acceptance probability does not shrink to $0$ as more parameters of the function's prior are introduced, even \emph{ad infinitum}. In numerical examples we demonstrate these competitive advantages over other function space priors. We also implement examples in Bayesian reinforcement learning to automate tasks from data and demonstrate, for the first time, stability of MCMC under mesh refinement for this type of problem.
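The abstract describes two ingredients: a Gaussian network prior whose variances decay with width so that the covariance operator is trace class, and a Hilbert space MCMC sampler whose acceptance probability does not degrade as the width grows. The following is a minimal Python sketch of both under stated assumptions, not the paper's exact construction: the single hidden layer, the tanh activation, the decay exponent `delta`, the Gaussian noise model, and all function names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Trace-class Gaussian prior (illustrative sketch) -----------------------
# Single hidden layer f(x) = sum_j v_j * tanh(w_j . x + b_j).  The output
# weights v_j are given variances j^{-(1 + delta)}; since these variances are
# summable, the prior covariance is trace class and f remains almost surely
# well defined as the width grows.  We work in "whitened" coordinates: the
# parameter vector u is standard normal and is rescaled inside the forward map.

def prior_scales(width, d, delta=0.5):
    j = np.arange(1, width + 1)
    sv = j ** (-(1.0 + delta) / 2.0)  # std of output weight v_j (assumed decay)
    # layout of u: [hidden weights W (width*d) | biases b (width) | outputs v (width)]
    return np.concatenate([np.ones(width * d), np.ones(width), sv])

def forward(u, scales, X, width, d):
    theta = u * scales                              # un-whiten the parameters
    W = theta[: width * d].reshape(width, d)
    b = theta[width * d : width * d + width]
    v = theta[width * d + width :]
    return np.tanh(X @ W.T + b) @ v

# --- Hilbert space MCMC: preconditioned Crank-Nicolson (pCN) ----------------
# In whitened coordinates the prior is N(0, I), and the pCN proposal
# u' = sqrt(1 - beta^2) u + beta xi, xi ~ N(0, I), is prior-reversible, so the
# acceptance ratio involves only the negative log-likelihood Phi.  This is the
# property behind stability under mesh refinement (here: growing width).

def neg_log_lik(u, scales, X, y, width, d, noise=0.1):
    r = y - forward(u, scales, X, width, d)
    return 0.5 * np.sum(r**2) / noise**2

def pcn_step(u, beta, phi):
    xi = rng.normal(size=u.shape)
    u_prop = np.sqrt(1.0 - beta**2) * u + beta * xi
    if np.log(rng.uniform()) < phi(u) - phi(u_prop):
        return u_prop, True
    return u, False

# Toy usage on synthetic 1-d regression data.
width, d, beta = 200, 1, 0.1
X = rng.uniform(-2, 2, size=(50, d))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=50)
scales = prior_scales(width, d)
u = rng.normal(size=scales.shape)
phi = lambda w: neg_log_lik(w, scales, X, y, width, d)
for _ in range(1000):
    u, _ = pcn_step(u, beta, phi)
```

In this sketch, doubling `width` adds parameters with ever-smaller prior variance, so the pCN acceptance rate is insensitive to the refinement, which is the dimension-robustness property the abstract claims for its sampler.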