Title
Empirical complexity of comparator-based nearest neighbor descent
Authors
Abstract
A Java parallel streams implementation of the $K$-nearest neighbor descent algorithm is presented using a natural statistical termination criterion. Input data consist of a set $S$ of $n$ objects of type V, and a Function<V, Comparator<V>>, which enables any $x \in S$ to decide which of $y, z \in S\setminus\{x\}$ is more similar to $x$. Experiments with the Kullback-Leibler divergence Comparator support the prediction that the number of rounds of $K$-nearest neighbor updates need not exceed twice the diameter of the undirected version of a random regular out-degree $K$ digraph on $n$ vertices. Overall complexity was $O(n K^2 \log_K(n))$ in the class of examples studied. When objects are sampled uniformly from a $d$-dimensional simplex, accuracy of the $K$-nearest neighbor approximation is high up to $d = 20$, but declines in higher dimensions, as theory would predict.
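The comparator-based interface described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name `KLComparatorDemo`, the divergence direction $D(x \,\|\, y)$, and the use of `double[]` probability vectors for type `V` are assumptions made for the example.

```java
import java.util.Comparator;
import java.util.function.Function;

public class KLComparatorDemo {
    // Kullback-Leibler divergence D(p || q) for probability vectors.
    // Terms with p[i] == 0 contribute nothing, by the usual convention.
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0) sum += p[i] * Math.log(p[i] / q[i]);
        }
        return sum;
    }

    // Function<V, Comparator<V>>: for a fixed x, return a Comparator that
    // orders candidates y, z by their divergence from x, so x can decide
    // which of the two is more similar to itself.
    static Function<double[], Comparator<double[]>> similarityTo =
        x -> Comparator.comparingDouble(y -> kl(x, y));

    public static void main(String[] args) {
        double[] x = {0.5, 0.3, 0.2};
        double[] y = {0.4, 0.4, 0.2};
        double[] z = {0.1, 0.1, 0.8};
        Comparator<double[]> cmp = similarityTo.apply(x);
        // y has smaller divergence from x than z does, so cmp ranks y first
        System.out.println(cmp.compare(y, z) < 0); // prints true
    }
}
```

Note that nearest neighbor descent needs only this comparator, never a numeric distance: the algorithm ranks candidate neighbors of $x$ pairwise, which is why a `Function<V, Comparator<V>>` suffices as the entire similarity interface.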