Paper title
Hierarchical mixtures of Gaussians for combined dimensionality reduction and clustering
Paper authors
Paper abstract
We introduce hierarchical mixtures of Gaussians (HMoGs), which unify dimensionality reduction and clustering into a single probabilistic model. HMoGs provide closed-form expressions for the model likelihood, exact inference over latent states and cluster membership, and exact algorithms for maximum-likelihood optimization. The novel exponential family parameterization of HMoGs greatly reduces their computational complexity relative to similar model-based methods, allowing them to efficiently model hundreds of latent dimensions and thereby capture additional structure in high-dimensional data. We demonstrate HMoGs on synthetic experiments and MNIST, and show how jointly optimizing dimensionality reduction and clustering improves model performance. We also explore how sparsity-constrained dimensionality reduction can further improve clustering performance while encouraging interpretability. By bridging classical statistical modelling with the scale of modern data and compute, HMoGs offer a practical approach to high-dimensional clustering that preserves the statistical rigour, interpretability, and uncertainty quantification often missing from embedding-based, variational, and self-supervised methods.
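To make the model class concrete, here is a minimal numpy sketch of the hierarchical structure the abstract describes: a discrete cluster selects a Gaussian over a low-dimensional latent state, which is mapped linearly (plus noise) to the high-dimensional observation, so cluster posteriors are available in closed form. All names, shapes, and the unit-covariance/isotropic-noise choices below are illustrative assumptions, not the paper's actual exponential family parameterization.

```python
# Hedged sketch of a hierarchical mixture of Gaussians (HMoG-like model):
# cluster k -> latent z ~ N(mu_k, I) -> observation x = W z + noise.
import numpy as np

rng = np.random.default_rng(0)

K, D_latent, D_obs = 3, 2, 5              # clusters, latent dim, observed dim
pi = np.full(K, 1.0 / K)                  # cluster weights
mus = rng.normal(size=(K, D_latent))      # per-cluster latent means (assumed)
W = rng.normal(size=(D_obs, D_latent))    # shared loading matrix (the
                                          # dimensionality-reduction component)
sigma2 = 0.1                              # isotropic observation noise variance

def sample(n):
    """Draw n observations from the hierarchical generative model."""
    ks = rng.choice(K, size=n, p=pi)                   # cluster memberships
    zs = mus[ks] + rng.normal(size=(n, D_latent))      # latent states
    xs = zs @ W.T + np.sqrt(sigma2) * rng.normal(size=(n, D_obs))
    return ks, zs, xs

def cluster_posterior(x):
    """Exact posterior p(k | x) over cluster membership.

    Marginalizing z gives x | k ~ N(W mu_k, W W^T + sigma2 I), so the
    posterior is closed-form, as the abstract notes for HMoGs.
    """
    cov = W @ W.T + sigma2 * np.eye(D_obs)
    prec = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    diffs = x[None, :] - mus @ W.T                     # (K, D_obs)
    log_lik = -0.5 * (logdet + np.einsum('kd,de,ke->k', diffs, prec, diffs))
    log_post = np.log(pi) + log_lik                    # shared constants cancel
    log_post -= log_post.max()                         # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

ks, zs, xs = sample(1000)
print(cluster_posterior(xs[0]))
```

Because the marginal of each cluster is itself Gaussian, both the likelihood and these responsibilities stay exact, which is what enables the exact EM-style maximum-likelihood optimization the abstract refers to.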