Locality Sensitive Hashing using Euclidean Distance

It’s quite similar to Locality Sensitive Hashing (LSH) for Cosine Similarity which we covered earlier. I will be referring to the same here, so it’s better if you go through the same before proceeding.

The difference lies in the way we compute hash value. As we have seen we can divide the region using planes. In each region we can have data-points.

Follow these steps (refer to diagram)

LSH for Euclidean
LSH for Euclidean

1. Divide the plane into small parts.
2. Project each data-point on the planes.
3. For each datapoint take the distance along each plane and use it to calculate the hash value.

Rest of the procedure is similar to cosine similarity process. Like finding the nearest neighbor.

Author: Ankur

An Engineer by choice

Leave a Reply

Your email address will not be published. Required fields are marked *