Effects of Distance Measure Choice on K-Nearest Neighbor Classifier Performance: A Review

Haneen Arafat Abu Alfeilat; Ahmad B.A. Hassanat; Omar Lasassmeh; Ahmad S. Tarawneh; Mahmoud Bashir Alhasanat; Hamzeh S. Eyal Salman; V.B. Surya Prasath

doi:10.1089/big.2018.0175

What is it about?

The K-nearest neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with that of the most complex classifiers in the literature. The core of this classifier is the measurement of distance or similarity between the tested examples and the training examples. This raises a major question: among the large number of distance and similarity measures available, which should be used with the KNN classifier? This review attempts to answer that question by evaluating the performance (measured by accuracy, precision, and recall) of KNN using a large number of distance measures, tested on a number of real-world data sets, with and without different levels of added noise. The experimental results show that the performance of the KNN classifier depends significantly on the distance used, with large gaps between the performances of different distances. We found that a recently proposed nonconvex distance performed best on most data sets compared with the other tested distances. In addition, the performance of KNN with this top-performing distance degraded by only about 20% even when the noise level reached 90%, and the same held for most of the other distances used. This means that the KNN classifier using any of the top 10 distances tolerates noise to a certain degree. Moreover, the results show that some distances are less affected by the added noise than others.
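To make the role of the distance measure concrete, here is a minimal sketch of a KNN classifier with a pluggable distance function. It is an illustration only, not the code used in the review: the function names, the toy data, and the choice of Euclidean, Manhattan, and Chebyshev distances are all assumptions made for this example, and the nonconvex distance mentioned above is not reproduced here.

```python
import math
from collections import Counter

# Three standard distance measures, chosen for illustration only.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k, distance):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by their distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: distance(train_X[i], query))
    # Majority vote over the labels of the k nearest neighbors.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two training points with different labels.
train_X = [(3.0, 3.0), (0.0, 4.5)]
train_y = ["a", "b"]
query = (0.0, 0.0)

for name, dist in [("euclidean", euclidean),
                   ("manhattan", manhattan),
                   ("chebyshev", chebyshev)]:
    print(name, knn_predict(train_X, train_y, query, k=1, distance=dist))
```

On this toy data the nearest neighbor of the query changes with the measure: the point (3, 3) is closest under the Euclidean and Chebyshev distances, while (0, 4.5) is closest under the Manhattan distance, so the predicted label changes even though the data and k stay the same.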

Why is it important?

The choice of distance measure directly affects the performance of the k-nearest neighbor classifier, so it is important to know the benefits and drawbacks of the various distance measures. This work provides a comprehensive reference for the k-nearest neighbor classifier with respect to a wide variety of distance and similarity measures.

This page is a summary of: Effects of Distance Measure Choice on K-Nearest Neighbor Classifier Performance: A Review, Big Data, August 2019, Mary Ann Liebert Inc,
DOI: 10.1089/big.2018.0175.
Contributors

The following have contributed to this page

Surya Prasath
Cincinnati Children's Hospital Medical Center

Comparison of various distance measures for k-nearest neighbor classifier in machine learning
