A Weighted Mutual k-Nearest Neighbour for Classification Mining

Paper title

A Weighted Mutual k-Nearest Neighbour for Classification Mining

Authors

Dhar, Joydip; Shukla, Ashaya; Kumar, Mukul; Gupta, Prashant

Abstract

kNN is a very effective instance-based learning method and is easy to implement. Because data are heterogeneous, noise from many possible sources is widespread, especially in large-scale databases. To eliminate noise and the effect of pseudo neighbours, this paper proposes a new learning algorithm that detects anomalies and removes pseudo neighbours from the dataset in order to obtain comparatively better results. The algorithm also tries to minimize the effect of distant neighbours. A certainty measure is also introduced for the experimental results. The advantage of combining mutual neighbours with distance-weighted voting is that the dataset is refined once anomalies are removed, and the weighting forces closer neighbours to count for more. Finally, the performance of the proposed algorithm is evaluated.
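
The abstract names two ingredients, mutual neighbours and distance-weighted voting, without giving the procedure. The Python sketch below is one plausible reading and not the authors' implementation. It assumes that a training point with no mutual k-nearest neighbour counts as a pseudo neighbour, that votes are weighted by inverse Euclidean distance, and that certainty is the winning share of the total vote weight. The function names, the `eps` parameter and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def k_nearest(points, target, k, skip=None):
    """Indices of the k points closest to `target` (Euclidean), ignoring index `skip`."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: math.dist(points[i], target),
    )
    return order[:k]


def remove_pseudo_neighbours(points, labels, k):
    """Keep only points that have at least one mutual k-nearest neighbour.

    Assumption: i and j are mutual neighbours when each is in the other's
    k-nearest list; a point with no mutual neighbour is treated as an anomaly.
    """
    knn = [set(k_nearest(points, p, k, skip=i)) for i, p in enumerate(points)]
    keep = [i for i in range(len(points)) if any(i in knn[j] for j in knn[i])]
    return [points[i] for i in keep], [labels[i] for i in keep]


def weighted_vote(points, labels, query, k, eps=1e-9):
    """Classify `query` by inverse-distance-weighted voting among its k nearest points.

    Returns (label, certainty); certainty is the winner's share of the vote
    weight, an assumed definition rather than the paper's.
    """
    votes = defaultdict(float)
    for i in k_nearest(points, query, k):
        votes[labels[i]] += 1.0 / (math.dist(points[i], query) + eps)
    best = max(votes, key=votes.get)
    return best, votes[best] / sum(votes.values())


# Toy data: two tight clusters plus one far-away point labelled "a".
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0),
          (20.0, 20.0)]
labels = ["a", "a", "a", "b", "b", "b", "a"]

clean_points, clean_labels = remove_pseudo_neighbours(points, labels, k=2)
label, certainty = weighted_vote(clean_points, clean_labels, (5.5, 5.5), k=3)
print(len(clean_points), label, certainty)
```

In this toy run the isolated point at (20, 20) is nobody's nearest neighbour, so the filter drops it before voting. The brute-force search is quadratic in the number of points, which is adequate for illustration only.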
