Paper Title
Spatiotemporal Clustering with Neyman-Scott Processes via Connections to Bayesian Nonparametric Mixture Models
Authors
Abstract
Neyman-Scott processes (NSPs) are point process models that generate clusters of points in time or space. They are natural models for a wide range of phenomena, ranging from neural spike trains to document streams. The clustering property is achieved via a doubly stochastic formulation: first, a set of latent events is drawn from a Poisson process; then, each latent event generates a set of observed data points according to another Poisson process. This construction is similar to Bayesian nonparametric mixture models like the Dirichlet process mixture model (DPMM) in that the number of latent events (i.e. clusters) is a random variable, but the point process formulation makes the NSP especially well suited to modeling spatiotemporal data. While many specialized algorithms have been developed for DPMMs, comparatively fewer works have focused on inference in NSPs. Here, we present novel connections between NSPs and DPMMs, with the key link being a third class of Bayesian mixture models called mixture of finite mixture models (MFMMs). Leveraging this connection, we adapt the standard collapsed Gibbs sampling algorithm for DPMMs to enable scalable Bayesian inference on NSP models. We demonstrate the potential of Neyman-Scott processes on a variety of applications including sequence detection in neural spike trains and event detection in document streams.
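The two-stage generative recipe described in the abstract can be illustrated with a short simulation. The following is only a minimal sketch under simplifying assumptions that are not taken from the paper: a one-dimensional time window, a homogeneous Poisson process for the latent events, and a Gaussian-shaped emission intensity around each latent event. The function name `sample_nsp` and all of its parameters are hypothetical and chosen for illustration.

```python
import numpy as np


def sample_nsp(T=100.0, latent_rate=0.1, mean_children=20.0, width=1.0, seed=0):
    """Draw one realization of a simple 1-D Neyman-Scott process.

    Assumptions (illustrative only): latent events follow a homogeneous
    Poisson process on [0, T]; each latent event emits points from a
    Poisson process whose intensity is a Gaussian bump with total mass
    `mean_children` and standard deviation `width`.
    """
    rng = np.random.default_rng(seed)

    # Stage 1: latent events. The count is Poisson(latent_rate * T) and,
    # given the count, the locations are uniform on [0, T].
    num_latent = rng.poisson(latent_rate * T)
    latent_times = rng.uniform(0.0, T, size=num_latent)

    # Stage 2: each latent event generates a Poisson number of observed
    # points, placed at Gaussian offsets from the latent time.
    counts = rng.poisson(mean_children, size=num_latent)
    parents = np.repeat(np.arange(num_latent), counts)
    points = latent_times[parents] + width * rng.standard_normal(parents.size)

    # Return observations in time order, with their (normally hidden) parents.
    # Note: points near the edges may fall slightly outside [0, T].
    order = np.argsort(points)
    return latent_times, points[order], parents[order]


if __name__ == "__main__":
    latent_times, points, parents = sample_nsp()
    print(f"{latent_times.size} latent events, {points.size} observed points")
```

In an inference setting only `points` would be observed; the number of latent events, their locations, and the `parents` assignments are the unknowns that a sampler such as the collapsed Gibbs algorithm mentioned in the abstract would have to recover.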