Here is how it works in plain English:
we have a training set of known features (normalized) and their classifications:
- many data points: [(feature1, feature2, feature3, …), (f1, f2, f3, …), …]
- corresponding labels/classifications: [category1, category2, …]

for any new data point t (a minimal code sketch follows these steps):
- calculate the distance between t and each of the training set data points
- sort the distances and keep the K nearest (most similar) data points
- take a majority vote among the K neighbors' labels to get the new label/class
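A minimal from-scratch sketch of the three steps above in Python, assuming Euclidean distance on already-normalized features (the write-up does not fix a distance metric, so that choice is an assumption):

import math
from collections import Counter

def knn_classify(train_points, train_labels, t, k=3):
    """Classify point t by majority vote among its k nearest neighbors."""
    # 1. distance from t to each training point (Euclidean, an assumption)
    distances = [(math.dist(t, p), label)
                 for p, label in zip(train_points, train_labels)]
    # 2. sort by distance and keep the K nearest neighbors
    k_nearest = sorted(distances, key=lambda d: d[0])[:k]
    # 3. majority vote over the neighbors' labels
    votes = Counter(label for _, label in k_nearest)
    return votes.most_common(1)[0][0]

# toy usage: two normalized features per point, two classes
points = [(0.10, 0.10), (0.12, 0.09), (0.80, 0.80), (0.82, 0.79)]
labels = ["A", "A", "B", "B"]
print(knn_classify(points, labels, (0.11, 0.10), k=3))  # -> A

Note that there is no real "training" step here: k-NN simply stores the data points and defers all the work to query time.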
The idea seems simple but it is quite powerful. One example is handwriting recognition: given enough training samples of handwritten digits 1, 2, …, 9, we can easily recognize new handwriting! A sketch of this example follows.
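As a hedged illustration of the handwriting case, here is a sketch using scikit-learn's bundled 8x8 digit images (the library and dataset are my choices, not the original's; any k-NN implementation and digit set would do):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()  # 1,797 8x8 grayscale images of handwritten digits
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5)  # K = 5 nearest neighbors
clf.fit(X_train, y_train)   # "fitting" just stores the training set
print("test accuracy:", clf.score(X_test, y_test))  # typically around 0.98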
From Wiki:
- In k-NN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.
- In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors.
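A minimal sketch of that regression variant, using 1-D features and a plain (unweighted) mean for brevity (both simplifications are mine):

def knn_regress(train_points, train_values, t, k=3):
    """Predict the value at t as the mean of the k nearest neighbors' values."""
    # sort training samples by distance to t, then average the k closest values
    nearest = sorted(zip(train_points, train_values),
                     key=lambda pv: abs(pv[0] - t))[:k]
    return sum(v for _, v in nearest) / k

# toy usage: predict a value for x = 2.5 from four 1-D samples
print(knn_regress([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0], 2.5, k=2))
# -> 25.0 (average of the two nearest neighbors' values)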
References:
https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
http://www.saedsayad.com/k_nearest_neighbors.htm