论文标题

$ k $ -MEANS的虚假当地微调结构

Structures of Spurious Local Minima in $k$-means

论文作者

Qian, Wei, Zhang, Yuqian, Chen, Yudong

论文摘要

$ k $ -Means聚类是无监督学习的基本问题。问题涉及将数据点的分区找到到$ K $群集中,以便将集群内变化最小化。尽管它的重要性和广泛的适用性,但对$ k $ - 均值问题的理论理解并没有完全令人满意。具有理论性能保证的现有算法通常依赖于复杂的(有时是人工)算法技术和对数据的限制假设。主要挑战在于问题的非凸性性质。特别是,除全球最佳距离以外,还有其他本地解决方案。此外,最简单,最受欢迎的算法是$ k $ -MEANS,即劳埃德的算法,通常在理论上和实践中都收敛于这种虚假的本地解决方案。 在本文中,我们从新的角度解决了$ k $ - 公元的问题,通过调查这些虚假的本地解决方案的结构,该结构具有$ k $地面真相簇的概率生成模型。即使$ k = 3 $,即使对于分离良好且平衡的簇,也存在虚假的本地最小值。这样的当地最低限度将两个中心放在一个真正的集群中,而在其他两个真正的群集中间的第三个中心。对于一般$ k $,一个本地最低限度将多个中心放在一个真正的集群中,一个中心位于多个真正的群集中间。也许令人惊讶的是,我们证明这本质上是在分离条件下唯一的杂种局部最小值。我们的结果涉及$ k $ -Means的配方,用于高斯或有限分布的混合物。我们的理论结果证实了现有的经验观察,并为$ k $ - 平均聚类的几种改进算法提供了理由。

$k$-means clustering is a fundamental problem in unsupervised learning. The problem concerns finding a partition of the data points into $k$ clusters such that the within-cluster variation is minimized. Despite its importance and wide applicability, a theoretical understanding of the $k$-means problem has not been completely satisfactory. Existing algorithms with theoretical performance guarantees often rely on sophisticated (sometimes artificial) algorithmic techniques and restricted assumptions on the data. The main challenge lies in the non-convex nature of the problem; in particular, there exist additional local solutions other than the global optimum. Moreover, the simplest and most popular algorithm for $k$-means, namely Lloyd's algorithm, generally converges to such spurious local solutions both in theory and in practice. In this paper, we approach the $k$-means problem from a new perspective, by investigating the structures of these spurious local solutions under a probabilistic generative model with $k$ ground truth clusters. As soon as $k=3$, spurious local minima provably exist, even for well-separated and balanced clusters. One such local minimum puts two centers at one true cluster, and the third center in the middle of the other two true clusters. For general $k$, one local minimum puts multiple centers at a true cluster, and one center in the middle of multiple true clusters. Perhaps surprisingly, we prove that this is essentially the only type of spurious local minima under a separation condition. Our results pertain to the $k$-means formulation for mixtures of Gaussians or bounded distributions. Our theoretical results corroborate existing empirical observations and provide justification for several improved algorithms for $k$-means clustering.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源