What is Weighted k-NN Algorithm

We have seen the basic k-NN algorithm. You may want to read that first to better understand this concept.

Using k-NN we try to predict the class of a query point “Q” (an unclassified data-point). To do this we use the concept of neighbours or points that are similar or close to the query point and then do majority vote.

But we miss something here. Suppose we have 5 neighbours. Don’t you think that the points which are closer to Q needs to be given more importance?

That’s what we do here. We assign weights to the data-points based on their distance from Q. One way to calculate weight is to divide distance by 1 i.e. weight = 1/distance

Consider the following neighbours of Q. Here we have added a new column Weight = 1/Distance

Point Label Distance Weight
A Positive 0.1 10
B Negative 0.2 5
C Positive 0.4 2.5
D Negative 0.5 2

Now that we have weights assigned to each neighbour we can use weighted k-NN to predict the Label for Q. To do this we calculate positive and negative sums.

Positive Sum
We calculate sum of weights for positive labels i.e. A,C : 10+2.5 = 12.5

Negative Sum
We calculate sum of weights for negative labels i.e. B,D : 5+2 = 7

Since Positive Sum > Negative Sum we can say that for Q the label will be Positive.