What is Locality Sensitive Hashing

It’s a beautiful technique to find similar points that is nearest neighbours. It uses the concept of Hashing for this.

You may be familiar with how Hash Table is constructed. If not, please follow this link for a quick refresher on Hashing concepts. It’s a very efficient data structure which allows us to perform operations in O(1) time.

How it Works?

We apply a hash function to each available data point. This gives us the bucket or the key where the point can be placed.

We try to maximize collisions, so that similar points go to same bucket.

Now if we get an unlabelled query point and apply the hash function then it will go to same bucket where similar points exist.

This makes it easy to find the nearest neighbors for the query point. And subsequently we can apply k-NN to predict it’s class.