computational-geometrynearest-neighborapproximationlocality-sensitive-hashapproximate-nn-searching

What is the ε (epsilon) parameter in Locality Sensitive Hashing (LSH)?


I've read the original paper about Locality Sensitive Hashing.

The complexity is in function of the parameter ε, but I don't understand what it is.

Can you explain its meaning please?


Solution

  • ε is the approximation parameter.

    LSH (as FLANN & kd-GeRaF) is designed for high dimensional data. In that space, k-NN doesn't work well, in fact it is almost as slow as brute force, because of the curse of dimensionality.

    For that reason, we focus on solving the aproximate k-NN. Check Definition 1 from our paper, which basically say that it's OK to return an approximate neighbor lying in (1 + ε) further distance than the exact neighbor.

    Check the image below:

    enter image description here

    here you see what does it mean finding the exact/approximate NN. In the traditional problem of NNS (Nearest Neighbor Search), we are asked to find the exact NN. In the modern problem, the approximate NNS, we are asked to find some neighbor inside the (1+ε) radius, thus either the exact or approximate NN would be a valid answer!

    So, with a high probability, LSH will return a NN inside that (1+ε) radius. For ε = 0, we actually solve the exact NN problem.