## Locality Sensitive Hashing using Euclidean Distance

It’s quite similar to Locality Sensitive Hashing (LSH) for Cosine Similarity which we covered earlier. I will be referring to the same here, so it’s better if you go through the same before proceeding.

The difference lies in the way we compute hash value. As we have seen we can divide the region using planes. In each region we can have data-points.

Follow these steps (refer to diagram)

1. Divide the plane into small parts.
2. Project each data-point on the planes.
3. For each datapoint take the distance along each plane and use it to calculate the hash value.

Rest of the procedure is similar to cosine similarity process. Like finding the nearest neighbor.

## Locality Sensitive Hashing using Cosine Similarity

The problem we are trying to solve is to predict the class of a new data point, given a dataset with pre-classified data points.

Two key ideas we will use here are  k-NN algorithm and LSH. If you don’t know about these concepts then I will suggest you to check them out first.

What is Cosine Similarity?
At a high level cosine similarity can tell us how similar two points are. To do this we compute the vector representation for the two points and then find the angle between the two vectors. The similarity between vectors a and b can be given by cosine of the angle between them.

We can use this concept to calculate the hash value for a data point.

Now that we know cosine similarity we can use this to calculate LSH values for data points. To do this we divide the space using hyperplanes.

Refer to the image below to understand each of the points explained next.