I've been working through the examples at http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html&comment-submitted#feedback and I got stuck trying to create a hash from the bits of the image after it's processed. If you hash the binary string created from the pixels of an image and then use the Hamming distance to measure how different the photos are, what does the hash add compared with computing the Hamming distance on the raw binary string directly? Is the hash created merely to speed things up?
I don't know much about hashes. I assume in this case they act as a filtering mechanism for nearly identical photos? But isn't this filtering accomplished by downsizing the photo and converting it to greyscale?
The idea presented in the blog post is how to recognize similar pictures. The goal is to lose the right kind of information so that what is left is significant and easy to compare. So there are two aspects: how fast and how accurately you can compare. If you reduce your picture to 8x8 black and white (that is, 64 bits of information), then it doesn't matter whether you call it a "raw bit string" or a "long hash" (well, as @Blender noted, it's not really a hash in the conventional use of the term). The important thing is how you reduce it: what information is left and what is lost.
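To make this concrete, here is a minimal sketch in pure Python, assuming the picture has already been shrunk to an 8x8 greyscale grid. It follows the "average hash" idea as I understand it from the blog post: each pixel becomes one bit depending on whether it is brighter than the mean. The function names and the sample grids (`img_a`, `img_b`, `img_c`) are made up for illustration and are not from the post.

```python
def average_hash(pixels):
    """Turn an 8x8 grid of greyscale values (0-255) into a 64-bit integer.

    Each bit is 1 if the pixel is brighter than the grid's mean, else 0.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


# Hypothetical 8x8 thumbnails (illustration only):
# a dark-to-bright gradient, a slightly brightened copy, and an inverted copy.
img_a = [[row * 8 + col for col in range(8)] for row in range(8)]
img_b = [[v + 3 for v in row] for row in img_a]
img_c = [[63 - v for v in row] for row in img_a]

hash_a = average_hash(img_a)
hash_b = average_hash(img_b)
hash_c = average_hash(img_c)

print(format(hash_a, "064b"))            # the same 64 bits, shown as a string
print(hamming_distance(hash_a, hash_b))  # brightened copy vs. original
print(hamming_distance(hash_a, hash_c))  # inverted copy vs. original
```

The 64-bit integer and the 64-character string printed by `format(hash_a, "064b")` hold exactly the same bits, so the Hamming distance is the same whichever you call it. What makes the comparison useful is the reduction step: a uniform brightness shift leaves every bit unchanged, while inverting the picture flips all of them.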