image-processing, computer-vision

How to find all shared regions between two images


I've been trying to find something that automatically finds all shared regions between two images, explicitly not based on pixel-matching or differencing, and after a fair bit of searching I'm coming up with essentially nothing.

Say I have the following two images, in this case website screenshots. The first is the "baseline":

[screenshot: baseline]

and the second is very similar, but with some modified CSS so that entire blocks have been moved around. No text content changes, no box dimension changes, just some elements repositioned:

[screenshot: modified layout]

In this case (but really in every case where one image is a derivative of the other), their pixel diff is effectively useless for seeing what changed:

[pixel diff of the two screenshots]

In fact, even if we apply some simple diff exaggeration, the result is still fairly useless, because we're still looking at pixel diffs rather than at diffs of what changed, so we won't (in any way) be looking at the actual modifications to the visual information:

[exaggerated pixel diff]
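For reference, by "pixel diff" and "diff exaggeration" I mean something like the following rough OpenCV sketch, assuming two same-size BGR arrays imageA and imageB:

    import cv2

    diff = cv2.absdiff(imageA, imageB)                  # raw per-pixel difference
    exaggerated = cv2.convertScaleAbs(diff, alpha=4.0)  # scale differences up so they're visible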

So this is like comparing two books and then deciding the books are different based on how many values for n we can find for which book1.letters[n] != book2.letters[n]...
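In code form, that naive comparison is essentially the following (with hypothetical book strings):

    book1 = "the quick brown fox jumps over the lazy dog"
    book2 = "a quick brown fox jumps over the lazy dog"  # same content, shifted by two letters

    # count positions at which the letters differ
    mismatches = sum(1 for a, b in zip(book1, book2) if a != b)
    print(mismatches)  # nearly every position "differs", despite near-identical content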

So, what I'm looking for is a way to compute regions of similarity, showing which parts of the two images encode the same information, but not necessarily in the same bounding boxes.

For instance, in the above two images, almost all the data is the same, just with some parts relocated. The only true difference is that there's mystery whitespace.

With similar regions color coded:

[the two screenshots with similar regions color coded]

and the correspondence:

[correspondence map between the two screenshots]

I can't find a single tool that does this, and I can't even find tutorials on implementing it with OpenCV or similar technologies. Maybe I'm searching for the wrong terms, or maybe no one ever actually wrote an image comparison tool for this (which seems beyond belief?), so at the risk of this being off topic: I've searched and researched as much as I can here. What are my options if I need this as a tool that can run as part of a normal (open source) QA/testing tool chain? (So: not some expensive plugin to equally expensive commercial software.)


Solution

  • To answer my own question: OpenCV (for Python) paired with scikit-image can pretty much get us there in two steps.

    1. perform an SSIM comparison between the two images, capturing the various differences as bounding box contours relative to the second image
    2. for each such contour in the second image, perform template matching with respect to the first image, which tells us whether a diff contour is a "change" or a "translation".

    In code, assuming two images imageA and imageB, with the same dimensions:

    import cv2
    import imutils
    from skimage.metrics import structural_similarity
    
    # ...a bunch of functions will be going here...
    
    # compare the two images, both in colour and in grayscale form,
    # starting from an empty list of diff regions
    diffs = compare(imageA, imageB, gray(imageA), gray(imageB), [])

    if len(diffs) > 0:
        highlight_diffs(imageA, imageB, diffs)
    else:
        print("no differences detected")
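    Where do imageA and imageB come from? A minimal sketch, assuming two same-size screenshots on disk (the filenames here are placeholders):

    imageA = cv2.imread("baseline.png")   # hypothetical path to the baseline screenshot
    imageB = cv2.imread("modified.png")   # hypothetical path to the modified screenshot
    assert imageA.shape == imageB.shape, "SSIM requires images with the same dimensions"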
    

    with the supporting functions being:

    def gray(img):
        # OpenCV images are BGR, so convert accordingly
        return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    def compare(or1, or2, im1, im2, diffs):
        # get the global SSIM score plus the full per-pixel SSIM map
        (score, diff) = structural_similarity(im1, im2, full=True)
        diff = (diff * 255).astype("uint8")

        # threshold the SSIM map so dissimilar regions become white
        # blobs, then find the contours of those blobs
        thresh = cv2.threshold(diff, 0, 255,
            cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
        contours = cv2.findContours(thresh.copy(),
            cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        contours = imutils.grab_contours(contours)

        # aggregate the contours as [x1, y1, x2, y2] regions, throwing away duplicates
        for c in contours:
            (x, y, w, h) = cv2.boundingRect(c)
            region = [x, y, x + w, y + h]
            if region not in diffs:
                diffs.append(region)

        return diffs
    

    Now, cv2.RETR_EXTERNAL is supposed to only yield "external contours", e.g. if there are diffs inside other diffs (say a box's border colour changed, and some text inside the box also changed), it should just yield one box, being the outer ("external") box.

    Except that's not what it does, so because I could get away with it I just wrote a really dumb function that weeds out inner boxes with n² runtime behaviour (note: not O(n²), just straight-up n²; it's not a search algorithm, there's no "best case" vs. "worst case", it just literally compares every element against every other element):

    def filter_diffs(diffs):
        # check whether region e is strictly contained inside any other region
        def not_contained(e, diffs):
            for t in diffs:
                if e[0] > t[0] and e[2] < t[2] and e[1] > t[1] and e[3] < t[3]:
                    return False
            return True

        # keep only the regions that aren't inside some other region
        return [e for e in diffs if not_contained(e, diffs)]
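    As a toy example of what this filtering does (the regions here are made up): the inner box gets dropped, the outer ones survive.

    boxes = [[0, 0, 100, 100], [10, 10, 20, 20], [150, 0, 200, 50]]
    print(filter_diffs(boxes))  # [[0, 0, 100, 100], [150, 0, 200, 50]]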
    

    That filtering function then gets used in the function that highlights the differences using coloured rectangles.

    RED = (0, 0, 255)  # OpenCV colours are BGR, so this is red

    def highlight_diffs(a, b, diffs):
        diffed = b.copy()

        # draw a red rectangle around every filtered diff region
        for area in filter_diffs(diffs):
            x1, y1, x2, y2 = area
            cv2.rectangle(diffed, (x1, y1), (x2, y2), RED, 2)

        cv2.imshow("Diffed", diffed)
        cv2.waitKey(0)
    

    This gets us the first part. As a test, I took a screenshot of Stack Overflow, and then another screenshot after moving the left advertisement down and recoloring the --yellow-100 CSS variable.

    This finds five diffs, but two of them aren't really "diffs" in the sense of new or removed content: they're just the result of "we moved a thing down".

    So, let's add in the template matching:

    BLUE = (255, 0, 0)   # remember: OpenCV colours are BGR
    GREEN = (0, 255, 0)

    def highlight_diffs(a, b, diffs):
        diffed = b.copy()

        for area in filter_diffs(diffs):
            x1, y1, x2, y2 = area

            # is this a relocation, or an addition/deletion?
            org = find_in_original(a, b, area)
            if org is not None:
                # relocation: mark the region in blue in both images
                cv2.rectangle(a, (org[0], org[1]), (org[2], org[3]), BLUE, 2)
                cv2.rectangle(diffed, (x1, y1), (x2, y2), BLUE, 2)
            else:
                # genuine addition/deletion: mark it in red, with a green inset
                cv2.rectangle(diffed, (x1+2, y1+2), (x2-2, y2-2), GREEN, 1)
                cv2.rectangle(diffed, (x1, y1), (x2, y2), RED, 2)

        cv2.imshow("Original", a)
        cv2.imshow("Diffed", diffed)
        cv2.waitKey(0)
    

    The template matching itself looks like this, with an incredibly strict threshold for "is the match we found actually good":

    def find_in_original(a, b, area):
        # crop the diff region out of B, then try to find that crop in A
        crop = b[area[1]:area[3], area[0]:area[2]]
        result = cv2.matchTemplate(crop, a, cv2.TM_CCOEFF_NORMED)

        # the best candidate is wherever the correlation map peaks
        (minVal, maxVal, minLoc, maxLoc) = cv2.minMaxLoc(result)
        (startX, startY) = maxLoc
        endX = startX + (area[2] - area[0])
        endY = startY + (area[3] - area[1])
        ocrop = a[startY:endY, startX:endX]

        # this basically needs to be a near-perfect match
        # for us to consider it a "moved" region rather than
        # a genuine difference between A and B.
        if structural_similarity(gray(ocrop), gray(crop)) >= 0.99:
            return [startX, startY, endX, endY]

        return None
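    As a quick sanity check of that helper (the region here is made up): any sufficiently textured crop of A should be "found" in A itself, at its own coordinates.

    area = [10, 10, 110, 60]                       # hypothetical x1, y1, x2, y2
    print(find_in_original(imageA, imageA, area))  # expected: [10, 10, 110, 60]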
    

    We can now compare the original and modified image, and see that the ad got moved in the modified image rather than being "new content", and we can see where it can be found in the original.

    And that's it, we have a visual diff that actually tells us something useful about the changes, rather than telling us which pixel happens to be a different colour.

    We could bring the template matching threshold down a little, to say 0.95, in which case the whitespace box would also end up matched to the original image. But because it's just whitespace, it'll get matched to something mostly meaningless (in this particular case, it matches the whitespace in the lower right of the original).

    Of course, a quality-of-life improvement would be to cycle through colours so that the various moved parts can all be related to each other by their shared colour, but that's the kind of thing that anyone can probably tack on top of this code themselves (see the sketch below).
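    For what it's worth, a minimal sketch of that colour cycling, reusing the functions above; the palette values are arbitrary BGR picks:

    from itertools import cycle

    # arbitrary BGR palette; each relocated region gets the next colour
    palette = cycle([(255, 0, 0), (0, 255, 255), (255, 0, 255), (0, 128, 255)])

    for area in filter_diffs(diffs):
        org = find_in_original(imageA, imageB, area)
        if org is not None:
            colour = next(palette)
            cv2.rectangle(imageA, (org[0], org[1]), (org[2], org[3]), colour, 2)
            cv2.rectangle(imageB, (area[0], area[1]), (area[2], area[3]), colour, 2)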

    Should anyone want to play with this code, the repo's over on https://github.com/Pomax/ci-image-diff, which houses both the diffing code (diff.py) and a larger "compare" script for running visual diffing as part of CI passes.