I am reading about binary descriptors but can't understand how they work. So far I have understood that the motivation is to create descriptors that can be computed and matched very fast. We sample n pairs of points in a patch, compare the intensities at the two points of each pair, and construct an n-bit vector from the results. We then use the Hamming distance as the similarity measure between two patches' descriptors.
Suppose I am comparing two identical patches using their binary descriptors. Since I sample the n pairs independently both times, there doesn't seem to be any correlation between the similarity of the patches and the feature vectors. I might have sampled n pairs in the first patch and the same n pairs in reverse order in the second patch, and the resulting Hamming distance would be n.
I have read the BRIEF paper.
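The pairs are not sampled again for each patch. The implementation typically hard-codes one single "random" set of pairs, in a fixed order, which is then used to compute the descriptor for every patch. Bit i of every descriptor therefore comes from the same pair of locations, so two identical patches give identical descriptors and a Hamming distance of 0.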
For BRIEF in OpenCV, you can find it on GitHub.
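To make the point about the fixed pattern concrete, here is a minimal sketch in plain Python. It is not OpenCV's implementation: the patch size, descriptor length, uniform sampling, and the names `sample_pairs`, `describe`, and `hamming` are all assumptions made for illustration, and the smoothing step that BRIEF applies before the intensity tests is left out.

```python
import random

PATCH_SIZE = 16   # assumed patch side length, for illustration only
N_BITS = 128      # assumed descriptor length

def sample_pairs(n, size, rng):
    """Draw n random pairs of (x, y) pixel coordinates inside a size x size patch."""
    def point():
        return (rng.randrange(size), rng.randrange(size))
    return [(point(), point()) for _ in range(n)]

# The pattern is sampled ONCE and then reused for every patch.
FIXED_PAIRS = sample_pairs(N_BITS, PATCH_SIZE, random.Random(0))

def describe(patch, pairs=FIXED_PAIRS):
    """Bit i is 1 if the first point of pair i is darker than the second."""
    return [1 if patch[y1][x1] < patch[y2][x2] else 0
            for (x1, y1), (x2, y2) in pairs]

def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def random_patch(rng):
    return [[rng.randrange(256) for _ in range(PATCH_SIZE)]
            for _ in range(PATCH_SIZE)]

rng = random.Random(1)
patch = random_patch(rng)
same_patch = [row[:] for row in patch]   # an exact copy
other_patch = random_patch(rng)          # unrelated content

# Same pattern for both patches: identical patches give distance 0.
d_same = hamming(describe(patch), describe(same_patch))

# Same pattern, different content: the distance is informative.
d_other = hamming(describe(patch), describe(other_patch))

# The scenario from the question: a different pattern per patch.
# Even for identical patches the bits no longer line up.
other_pairs = sample_pairs(N_BITS, PATCH_SIZE, random.Random(2))
d_resampled = hamming(describe(patch), describe(same_patch, other_pairs))

print(d_same, d_other, d_resampled)
```

With the shared pattern, `d_same` is 0 by construction, because both descriptors run the same tests on the same pixels. With independently sampled patterns, the distance says nothing about patch similarity, which is exactly the problem described in the question. Real implementations usually also pack the bits into bytes so that the Hamming distance can be computed with XOR and a population count.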