Tags: python, image-processing, background-subtraction, image-comparison, timelapse

How can I quantify the difference between two images?


Here's what I would like to do:

I'm taking pictures with a webcam at regular intervals. Sort of like a time lapse thing. However, if nothing has really changed, that is, the picture pretty much looks the same, I don't want to store the latest snapshot.

I imagine there's some way of quantifying the difference, and I would have to empirically determine a threshold.

I'm looking for simplicity rather than perfection. I'm using python.


Solution

  • General idea

    Option 1: Load both images as arrays (scipy.misc.imread) and calculate an element-wise (pixel-by-pixel) difference. Calculate the norm of the difference.

    Option 2: Load both images. Calculate some feature vector for each of them (like a histogram). Calculate distance between feature vectors rather than images.
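    A minimal sketch of Option 2, assuming both images are already grayscale NumPy/SciPy arrays (the helper name histogram_distance and the bin count are illustrative):

    import numpy as np
    
    def histogram_distance(img1, img2, bins=256):
        # build normalized intensity histograms as feature vectors
        h1, _ = np.histogram(img1, bins=bins, range=(0, 255), density=True)
        h2, _ = np.histogram(img2, bins=bins, range=(0, 255), density=True)
        # distance between the feature vectors rather than the images
        return np.linalg.norm(h1 - h2)
    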

    However, there are some decisions to make first.

    Questions

    You should answer these questions first: Are the images the same size and shape? Are they well-aligned? Do they have the same exposure? Are they color or grayscale?

    Example

    I assume your images are well-aligned, the same size and shape, possibly with different exposure. For simplicity, I convert them to grayscale even if they are color (RGB) images.

    You will need these imports:

    import sys
    
    from scipy.misc import imread
    from scipy.linalg import norm
    from scipy import sum, average
    

    The main function reads two images, converts them to grayscale, compares them, and prints the results:

    def main():
        file1, file2 = sys.argv[1:1+2]
        # read images as 2D arrays (convert to grayscale for simplicity)
        img1 = to_grayscale(imread(file1).astype(float))
        img2 = to_grayscale(imread(file2).astype(float))
        # compare
        n_m, n_0 = compare_images(img1, img2)
        print "Manhattan norm:", n_m, "/ per pixel:", n_m/img1.size
        print "Zero norm:", n_0, "/ per pixel:", n_0*1.0/img1.size
    

    How to compare (img1 and img2 are 2D SciPy arrays here):

    def compare_images(img1, img2):
        # normalize to compensate for exposure difference, this may be unnecessary
        # consider disabling it
        img1 = normalize(img1)
        img2 = normalize(img2)
        # calculate the difference and its norms
        diff = img1 - img2  # elementwise for scipy arrays
        m_norm = sum(abs(diff))  # Manhattan norm
        z_norm = norm(diff.ravel(), 0)  # Zero norm
        return (m_norm, z_norm)
    

    If the file is a color image, imread returns a 3D array; average the RGB channels (the last array axis) to obtain intensity. There is no need to do this for grayscale images (e.g. .pgm):

    def to_grayscale(arr):
        "If arr is a color image (3D array), convert it to grayscale (2D array)."
        if len(arr.shape) == 3:
            return average(arr, -1)  # average over the last axis (color channels)
        else:
            return arr
    

    Normalization is trivial; you may choose to normalize to [0,1] instead of [0,255]. arr is a SciPy array here, so all operations are element-wise:

    def normalize(arr):
        rng = arr.max()-arr.min()
        amin = arr.min()
        if rng == 0:
            # avoid division by zero for a uniform (e.g. all-black) image
            return arr - amin
        return (arr-amin)*255/rng
    

    Run the main function:

    if __name__ == "__main__":
        main()
    

    Now you can put this all in a script and run it against two images. If we compare an image to itself, there is no difference:

    $ python compare.py one.jpg one.jpg
    Manhattan norm: 0.0 / per pixel: 0.0
    Zero norm: 0 / per pixel: 0.0
    

    If we blur the image and compare it to the original, there is some difference:

    $ python compare.py one.jpg one-blurred.jpg 
    Manhattan norm: 92605183.67 / per pixel: 13.4210411116
    Zero norm: 6900000 / per pixel: 1.0
    

    P.S. Entire compare.py script.
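    To decide whether to store the latest snapshot, you can compare the per-pixel Manhattan norm against an empirically chosen threshold. A minimal sketch, reusing compare_images and to_grayscale from above (the threshold value and the function name is_worth_saving are illustrative):

    THRESHOLD = 5.0  # per-pixel Manhattan norm; tune empirically for your camera
    
    def is_worth_saving(prev_file, new_file, threshold=THRESHOLD):
        # read both snapshots as grayscale arrays
        img_prev = to_grayscale(imread(prev_file).astype(float))
        img_new = to_grayscale(imread(new_file).astype(float))
        # keep the new snapshot only if it differs enough from the previous one
        n_m, _ = compare_images(img_prev, img_new)
        return n_m / img_prev.size > threshold
    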

    Update: relevant techniques

    As the question is about a video sequence, where frames are likely to be almost the same and you are looking for something unusual, I'd like to mention some alternative approaches which may be relevant:

    I strongly recommend taking a look at the “Learning OpenCV” book, Chapters 9 (Image parts and segmentation) and 10 (Tracking and motion). The former teaches how to use the background subtraction method, the latter gives some information on optical flow methods. All these methods are implemented in the OpenCV library. If you use Python, I suggest using OpenCV ≥ 2.3 and its cv2 Python module.

    The simplest version of background subtraction: learn the average value μ and standard deviation σ for every pixel of the background, then flag a pixel of the current frame as changed when its value falls outside a band such as (μ − 2σ, μ + 2σ).

    More advanced versions take into account the time series for every pixel and can handle non-static scenes (like moving trees or grass).
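    A minimal sketch of the simple scheme in NumPy (the function names and the 2σ band are illustrative assumptions, not an OpenCV API):

    import numpy as np
    
    def learn_background(frames):
        # per-pixel mean and standard deviation over a list of
        # background-only grayscale frames (2D arrays of equal shape)
        stack = np.stack([f.astype(float) for f in frames])
        return stack.mean(axis=0), stack.std(axis=0)
    
    def changed_pixels(frame, mean, std, k=2.0):
        # flag pixels whose value falls outside (mean - k*std, mean + k*std)
        return np.abs(frame.astype(float) - mean) > k * std
    
    # The fraction of changed pixels can serve as a "something moved" score:
    # score = changed_pixels(current_frame, mean, std).mean()
    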

    The idea of optical flow is to take two or more frames and assign a velocity vector to every pixel (dense optical flow) or to some of them (sparse optical flow). To estimate sparse optical flow, you may use the Lucas-Kanade method (it is also implemented in OpenCV). Obviously, if there is a lot of flow (a high average or maximum magnitude of the velocity field), then something is moving in the frame and subsequent images are more different.
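    As an illustration, dense optical flow with the Farnebäck method from OpenCV's cv2 module might look like this (assuming a reasonably recent OpenCV; the parameter values are the ones commonly used in the OpenCV examples, and mean_flow_magnitude is an illustrative name):

    import cv2
    import numpy as np
    
    def mean_flow_magnitude(prev_gray, next_gray):
        # prev_gray and next_gray are consecutive 8-bit grayscale frames;
        # a large mean magnitude suggests that something is moving
        flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        magnitude = np.sqrt(flow[..., 0]**2 + flow[..., 1]**2)
        return magnitude.mean()
    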

    Comparing histograms may help to detect sudden changes between consecutive frames. This approach was used in Courbon et al., 2010:

    Similarity of consecutive frames. The distance between two consecutive frames is measured. If it is too high, it means that the second frame is corrupted and thus the image is eliminated. The Kullback–Leibler distance, or mutual entropy, on the histograms of the two frames is used:

    $$ d(p,q) = \sum_i p(i) \log (p(i)/q(i)) $$

    where p and q are the histograms of the frames. The threshold is fixed at 0.2.
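    A minimal sketch of that histogram check (the eps term to avoid log(0) and the helper name kl_distance are illustrative; 0.2 is the threshold quoted above):

    import numpy as np
    
    def kl_distance(frame1, frame2, bins=256, eps=1e-10):
        # intensity histograms of two grayscale frames, normalized to
        # probability distributions p and q
        h1, _ = np.histogram(frame1, bins=bins, range=(0, 255))
        h2, _ = np.histogram(frame2, bins=bins, range=(0, 255))
        p = h1.astype(float) / h1.sum() + eps
        q = h2.astype(float) / h2.sum() + eps
        # Kullback-Leibler distance d(p, q) = sum_i p(i) * log(p(i) / q(i))
        return np.sum(p * np.log(p / q))
    
    # Flag the second frame as "too different" when kl_distance(f1, f2) > 0.2
    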