Paper Title
Gradient Based Clustering
Authors
Abstract
We propose a general approach to distance-based clustering that uses the gradient of the cost function measuring clustering quality with respect to cluster assignments and cluster center positions. The approach is an iterative two-step procedure (alternating between cluster assignment and cluster center updates) and is applicable to a wide range of cost functions that satisfy some mild assumptions. The main advantage of the proposed approach is a simple and computationally cheap update rule. Unlike previous methods that are tailored to a specific formulation of the clustering problem, our approach applies to a wide range of costs, including non-Bregman clustering methods based on the Huber loss. We analyze the convergence of the proposed algorithm and show that, under arbitrary center initialization, it converges to the set of appropriately defined fixed points. In the special case of Bregman cost functions, the algorithm converges to the set of centroidal Voronoi partitions, which is consistent with prior work. Numerical experiments on real data demonstrate the effectiveness of the proposed method.
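The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is only one plausible reading, not the paper's algorithm as specified: the abstract does not state the exact update rule, step size, or assumptions. The sketch assumes nearest-center assignment under the Euclidean distance, a single fixed-step gradient update per center in each iteration, and a Huber loss applied to the center-to-point distance. The names `huber_grad`, `gradient_clustering`, `step`, `iters`, and `delta` are hypothetical.

```python
import numpy as np


def huber_grad(diff, delta=1.0):
    """Gradient, with respect to the center, of the Huber loss of ||center - point||.

    `diff` has shape (m, d) and holds `center - point` for each assigned point.
    Assumed loss: r**2 / 2 if r <= delta, else delta * r - delta**2 / 2.
    """
    norms = np.linalg.norm(diff, axis=1, keepdims=True)
    # Quadratic region: gradient is diff. Linear region: delta * diff / ||diff||.
    scale = np.where(norms <= delta, 1.0, delta / np.maximum(norms, 1e-12))
    return diff * scale


def assign(points, centers):
    """Step 1: label each point with the index of its nearest center."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def gradient_clustering(points, centers, step=0.01, iters=200, delta=1.0):
    """Alternate between assignment and one gradient step per center (a sketch)."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(iters):
        labels = assign(points, centers)
        # Step 2: move each center along the negative gradient of its cluster cost.
        for k in range(len(centers)):
            members = points[labels == k]
            if len(members) == 0:
                continue  # assumption: an empty cluster leaves its center unchanged
            grad = huber_grad(centers[k] - members, delta).sum(axis=0)
            centers[k] -= step * grad
    return centers, assign(points, centers)
```

Calling `gradient_clustering(points, initial_centers)` with an `(n, d)` array of points and a `(K, d)` array of initial centers returns the final centers and labels. Because the gradient is summed over cluster members, the fixed step in this sketch should be small relative to the largest cluster size; a different step-size rule may be what the paper actually analyzes.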