论文标题
MSPP:一种用于采矿相似点的高效且可扩展的算法
MSPP: A Highly Efficient and Scalable Algorithm for Mining Similar Pairs of Points
论文作者
论文摘要
最接近的点问题或最接近的一对问题(CPP)是计算几何形状中的一个重要问题,我们必须从它们之间最小距离的度量空间中的一组点中找到一对点。这个问题出现在许多应用中,例如但不限于聚类,图形分区,图像处理,模式识别和入侵检测。例如,在空运控制中,我们必须监视太近的飞机,因为这可能可能表明可能发生碰撞。已经提出了许多用于求解CPP的算法。实践中采用的算法具有最坏的情况二次运行时间复杂性。在本文中,我们提出了一种称为MSPP的CPP的优雅近似算法:采矿相似的点。它比当前最著名的算法快,同时保持了非常良好的准确性。拟议的算法还检测到欧几里得和皮尔逊度量空间中的一对非常相似的点对,并且可以在许多现实世界应用中进行调整,例如聚类,降低尺寸降低,构建和分析基因/转录本共表达网络等。
The closest pair of points problem or closest pair problem (CPP) is an important problem in computational geometry where we have to find a pair of points from a set of points in metric space with the smallest distance between them. This problem arises in a number of applications, such as but not limited to clustering, graph partitioning, image processing, patterns identification, and intrusion detection. For example, in air-traffic control, we must monitor aircrafts that come too close together, since this may potentially indicate a possible collision. Numerous algorithms have been presented for solving the CPP. The algorithms that are employed in practice have a worst case quadratic run time complexity. In this article we present an elegant approximation algorithm for the CPP called MSPP: Mining Similar Pairs of Points. It is faster than currently best known algorithms while maintaining a very good accuracy. The proposed algorithm also detects a set of closely similar pairs of points in Euclidean and Pearson metric spaces and can be adapted in numerous real world applications, such as clustering, dimension reduction, constructing and analyzing gene/transcript co-expression network, among others.