image-processing cbir

Matching an image to a collection of images


I have a large collection of card images, and one photo of a particular card. What tools can I use to find which image in the collection is most similar to mine?

Here's collection sample:

Here's what I'm trying to find:


Solution

  • Thank you for posting some photos.

    I have implemented an algorithm called Perceptual Hashing, which I found described by Dr Neal Krawetz. Comparing your images with the Card, I get the following percentage measures of similarity:

    Card vs. Abundance 79%
    Card vs. Aggressive 83%
    Card vs. Demystify 85%
    

    So it is not an ideal discriminator for your type of image, but it works to some degree. You may wish to experiment with it to tailor it to your use case.
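
    To put these percentages in terms of bits: the script below scores similarity as 100 - (hamming*100/64) using integer arithmetic, so the figures above correspond to roughly 10 to 14 differing bits out of 64. A small sketch of that conversion (the function name `similarity` is only for illustration):

    ```shell
    # Same integer arithmetic as the script's HammingDistance function
    similarity() {
       echo $(( 100 - ($1 * 100 / 64) ))
    }

    # Print the score for a few Hamming distances
    for bits in 10 11 14; do
       echo "$bits differing bits -> $(similarity "$bits")%"
    done
    ```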

    I would calculate the hash of each image in your collection just once and store it. Then, when you get a new card, calculate its hash and compare it to the stored ones.

    #!/bin/bash
    ################################################################################
    # Similarity
    # Mark Setchell
    #
    # Calculate percentage similarity of two images using Perceptual Hashing
    # See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
    #
    # Method:
    # 1) Resize image to a greyscale 8x8 pixel square, regardless of aspect ratio
    # 2) Calculate mean brightness of those 64 pixels
    # 3) For each pixel, store "1" if pixel > mean, else store "0"
    # 4) Convert resulting 64-bit string of 1s and 0s to a 16 hex digit "Perceptual Hash"
    #
    # If finding difference between Perceptual Hashes, simply total up number of bits
    # that differ between the two strings - this is the Hamming distance.
    #
    # Requires ImageMagick - www.imagemagick.org
    #
    # Usage:
    #
    # Similarity image|imageHash [image|imageHash]
    # If you pass one image filename, it will tell you the Perceptual hash as a 16
    # character hex string that you may want to store in an alternate stream or as
    # an attribute or tag in filesystems that support such things. Do this in order
    # to just calculate the hash once for each image.
    #
    # If you pass in two images, or two hashes, or an image and a hash, it will try
    # to compare them and give a percentage similarity between them.
    ################################################################################
    function PerceptualHash(){
    
       TEMP="tmp$$.png"
    
       # Force image to 8x8 pixels and greyscale
       convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"
    
       # Calculate mean brightness and correct to range 0..255
       MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)
    
       # Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
       hash=""
       for i in {0..7}; do
          for j in {0..7}; do
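             # Read one pixel as text and take the first number in parentheses
             # ([0-9] instead of \d, which grep -E does not support everywhere)
             pixel=$(convert "${TEMP}[1x1+${i}+${j}]" -colorspace gray text: | grep -Eo "\([0-9]+" | head -n 1 | tr -d '(')
             bit="0"
             [ "$pixel" -gt "$MEAN" ] && bit="1"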
             hash="$hash$bit"
          done
       done
       hex=$(echo "obase=16;ibase=2;$hash" | bc)
       # Left pad with zeros to 16 hex digits (the 0 flag of printf is not portable with %s)
       hex="0000000000000000$hex"
       echo "${hex: -16}"
       rm -f "$TEMP"
    }
    
    function HammingDistance(){
       # Convert input hex strings to upper case, as bc requires
       STR1=$(tr '[a-z]' '[A-Z]' <<< "$1")
       STR2=$(tr '[a-z]' '[A-Z]' <<< "$2")
    
       # Convert hex to binary and zero left pad to 64 binary digits
       # (padding by hand, because the 0 flag of printf is not portable with %s)
       printf -v ZEROS '%064d' 0
       STR1="$ZEROS$(echo "obase=2;ibase=16;$STR1" | bc)"; STR1="${STR1: -64}"
       STR2="$ZEROS$(echo "obase=2;ibase=16;$STR2" | bc)"; STR2="${STR2: -64}"
    
       # Calculate Hamming distance between two strings, each differing bit adds 1
       hamming=0
       for i in {0..63};do
          a=${STR1:i:1}
          b=${STR2:i:1}
          [ "$a" != "$b" ] && ((hamming++))
       done
    
       # Hamming distance is in range 0..64 and small means more similar
       # We want percentage similarity, so we do a little maths
       similarity=$((100-(hamming*100/64)))
       echo $similarity
    }
    
    function Usage(){
       echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
       exit 1
    }
    
    ################################################################################
    # Main
    ################################################################################
    if [ $# -eq 1 ]; then
       # Expecting a single image file for which to generate hash
       if [ ! -f "$1" ]; then
          echo "ERROR: File $1 does not exist" >&2
          exit 1
       fi
       PerceptualHash "$1" 
       exit 0
    fi
    
    if [ $# -eq 2 ]; then
       # Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
       if [ -f "$1" ]; then
          hash1=$(PerceptualHash "$1")
       else
          hash1=$1
       fi
       if [ -f "$2" ]; then
          hash2=$(PerceptualHash "$2")
       else
          hash2=$2
       fi
       HammingDistance "$hash1" "$hash2"
       exit 0
    fi
    
    Usage
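
    The store-once, compare-later workflow described before the script could be sketched as follows. This is only an illustration, not part of the original script: it assumes a hypothetical index file with one `HASH FILENAME` pair per line, and the function names `hamming` and `best_match` are made up. It works on stored hashes only, so it does not call ImageMagick.

    ```shell
    #!/bin/bash
    # Hypothetical index, built once by running the script above on each image:
    #   for f in cards/*.png; do echo "$(./Similarity "$f") $f"; done > hashes.txt

    # Hamming distance between two 16-digit hex hashes
    hamming() {
       local a=$1 b=$2 total=0 half x
       for half in 0 8; do
          # XOR 32 bits at a time to stay clear of bash's signed 64-bit arithmetic
          x=$(( 16#${a:half:8} ^ 16#${b:half:8} ))
          while (( x )); do
             total=$(( total + (x & 1) ))
             x=$(( x >> 1 ))
          done
       done
       echo "$total"
    }

    # Print "FILENAME DISTANCE" for the stored hash closest to the query hash
    best_match() {
       local query=$1 index=$2 best_file="" best_dist=65 hash file d
       while read -r hash file; do
          d=$(hamming "$query" "$hash")
          if (( d < best_dist )); then
             best_dist=$d
             best_file=$file
          fi
       done < "$index"
       echo "$best_file $best_dist"
    }
    ```

    With such an index, `best_match "$(./Similarity photo.jpg)" hashes.txt` would report the nearest stored card and how many of the 64 bits differ; `photo.jpg` and `hashes.txt` are placeholder names.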