python, opencv, opencv3.0, template-matching

How to align two images based on a common feature with matchTemplate


I have two images which overlap. I'd like to align these two images. My current approach is to find a common feature (a marking) in both images. I'd then like to align these two images according to the place where the feature overlaps.

The images aren't perfect, so I'm looking for some way that will align them based on the 'best' fit (most overlap). Originally I tried to align the images using feature matching through SIFT, but the feature matches were often incorrect or too few.

Here's the code I used to find the template:

import cv2
import numpy as np

# Load the marker template as grayscale and keep only its edge-like content
# (image minus its erosion).
template = cv2.imread('template.png', 0)
template = template - cv2.erode(template, None)

image1 = cv2.imread('Image to align1.png')
image2 = cv2.imread('Image to align2.png')
image = image2
img2 = image[:, :, 2]                    # red channel (BGR ordering)
img2 = img2 - cv2.erode(img2, None)      # same edge enhancement as the template

# Normalized cross-correlation between the template and the image.
ccnorm = cv2.matchTemplate(img2, template, cv2.TM_CCORR_NORMED)
print(ccnorm.max())
loc = np.where(ccnorm == ccnorm.max())   # location(s) of the best score
print(loc)
threshold = 0.1
th, tw = template.shape[:2]
for pt in zip(*loc[::-1]):               # pt is (x, y); loc is (rows, cols)
    if ccnorm[pt[::-1]] < threshold:
        continue
    cv2.rectangle(image, pt, (pt[0] + tw, pt[1] + th),
                  (0, 0, 255), 2)

Here are the matched features, 1 and 2. Thanks in advance.


Solution

With the OpenCV library, your options are to use any number of methods to select a few points, and to create the transformation between those points by using a function like getAffineTransform or getPerspectiveTransform. Note that functions like these take points as arguments, not luminosity values (images). You'll want to find points of interest in the first image (say, those marker spots), find those same points in the second image, and pass the pixel locations to a function like getAffineTransform or getPerspectiveTransform. Then, once you have that transformation matrix, you can use warpAffine or warpPerspective to warp the second image into the coordinates of the first (or vice versa).
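
As a rough sketch of that pipeline, assuming you've already measured the pixel coordinates of the same three markers in both images (the coordinates below are only placeholders):

import cv2
import numpy as np

image1 = cv2.imread('Image to align1.png')
image2 = cv2.imread('Image to align2.png')

# Placeholder coordinates: the same three marker centres measured in each image.
pts1 = np.float32([[120, 85], [430, 90], [270, 310]])   # locations in image1
pts2 = np.float32([[100, 70], [415, 80], [255, 300]])   # the same markers in image2

# 2x3 affine matrix mapping image2 coordinates onto image1 coordinates.
M = cv2.getAffineTransform(pts2, pts1)

# Warp image2 into image1's coordinate frame.
h, w = image1.shape[:2]
aligned = cv2.warpAffine(image2, M, (w, h))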

Affine transformations include translation, rotation, scaling, and shearing. Perspective transformations include everything affine transformations do, plus perspective distortion in the x and y directions. For getAffineTransform you need to pass three pairs of points: three points from the first image and the locations of those same pixels in the second image. For getPerspectiveTransform, you pass four such pixel pairs. If you want to use all of your marker points, you can use findHomography instead, which allows you to supply more than four points and computes an optimal homography between all of your matched points.
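
For instance, a minimal findHomography sketch with made-up point pairs (in practice these would be your matched marker locations):

import cv2
import numpy as np

# Placeholder coordinates: N >= 4 matched marker locations in each image.
pts1 = np.float32([[120, 85], [430, 90], [270, 310], [60, 400], [500, 380]])
pts2 = np.float32([[100, 70], [415, 80], [255, 300], [45, 390], [488, 372]])

# RANSAC rejects outlier pairs and returns the best-fitting 3x3 homography.
H, mask = cv2.findHomography(pts2, pts1, cv2.RANSAC, 5.0)

image1 = cv2.imread('Image to align1.png')
image2 = cv2.imread('Image to align2.png')
h, w = image1.shape[:2]
aligned = cv2.warpPerspective(image2, H, (w, h))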

When you use feature detection and matching to align images, it's using these functions in the background. The difference is that it finds the features for you. But if that's not working, simply use manual methods to find features to your liking, and then use these functions on those feature points. For example, you could find the template locations as you already have, define that region as a region of interest (ROI), then break the marker into smaller template pieces and find those locations inside the ROI of each image. Then you have corresponding pairs of points from both images; you can feed their locations into findHomography, or just pick three to use with getAffineTransform or four with getPerspectiveTransform, and you'll get your image transformation, which you can then apply.
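
One way to sketch that idea (the piece filenames and the locate() helper here are hypothetical; you would crop the pieces from your marker region yourself):

import cv2
import numpy as np

def locate(piece, image_gray):
    # Best-match top-left corner (x, y) of `piece` inside `image_gray`.
    res = cv2.matchTemplate(image_gray, piece, cv2.TM_CCORR_NORMED)
    _, _, _, max_loc = cv2.minMaxLoc(res)
    return max_loc

img1_gray = cv2.imread('Image to align1.png', 0)
img2_gray = cv2.imread('Image to align2.png', 0)

# Hypothetical small templates cropped from different parts of the marker.
pieces = [cv2.imread('marker_piece_%d.png' % i, 0) for i in range(4)]

pts1 = np.float32([locate(p, img1_gray) for p in pieces])
pts2 = np.float32([locate(p, img2_gray) for p in pieces])

# With four or more corresponding points, estimate the transformation.
H, _ = cv2.findHomography(pts2, pts1, cv2.RANSAC, 5.0)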


Otherwise you'll need to use something like the Lucas-Kanade optical flow algorithm, which can do direct image matching if you don't want to use feature-based methods, but it is incredibly slow compared to selecting a few feature points and finding homographies that way if you use the whole image. However, if you only have to do it for a few images, it's not such a huge deal. To be more accurate and converge much faster, it helps if you can provide a starting homography that at least translates the image roughly to the right position (e.g., you do your feature detection, see that the feature is roughly (x', y') pixels away in the second image, and create a homography with that translation).
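
For example, a translation-only starting homography built from such an offset might look like this (the dx, dy values are placeholders for whatever offset you measure):

import numpy as np

# Rough offset of the marker in image2 relative to image1, e.g. taken from
# the matchTemplate locations in both images.
dx, dy = 20.0, -15.0

# Translation-only 3x3 homography, usable as an initial estimate for a
# direct (intensity-based) refinement routine.
H_init = np.array([[1.0, 0.0, dx],
                   [0.0, 1.0, dy],
                   [0.0, 0.0, 1.0]])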

You can also likely find some Python routines online for homography estimation with the Lucas-Kanade inverse compositional algorithm or the like, if you want to try that. I have my own custom routine for that algorithm as well, but I can't share it; however, I could run it on your images if you share the originals without the bounding boxes, to provide you with some estimated homographies to compare against.