I'm looking for a good panorama stitching library for text images. I've tried OpenCV and OpenPano. Both work well on regular photos, but fail on text. For example, I need to stitch the following 3 images:

The images have about 45% overlap with each other.

If there's a way to make one of the mentioned libraries work well on text images, instead of finding another library, that would be great.
OpenPano fails at stitching text because it cannot retrieve enough feature points (or keypoints) to perform the stitching.

Text stitching doesn't need a matching method that is robust to rotations, only to translations. OpenCV conveniently offers such a tool: template matching. The solution I develop below is based on this OpenCV feature.
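For reference, the core primitive only takes a few lines. This is a minimal sketch; the file names are placeholders:

```python
import cv2

# minimal template-matching sketch (file names are placeholders)
img = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)       # image to search in
template = cv2.imread('patch.png', cv2.IMREAD_GRAYSCALE)  # patch to look for

# matchTemplate slides the template over the image and scores every position
res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF)
# with TM_CCOEFF, the best match is the maximum of the score map
_, _, _, max_loc = cv2.minMaxLoc(res)
print("template's top-left corner found at (x, y) =", max_loc)
```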
I will now explain the main steps of my solution (for further details, please have a look at the code provided below).
In order to match two consecutive images (done in the `matchImages` function, see code below):

1. We create a template image by taking the right part (`H_templ_ratio`) of the first image. This step is done in my code by the `genTemplate` function.
2. We add black margins around the second image (the one in which we search for the template). The template could theoretically be found anywhere in this margined image. This process is done in the `addBlackMargins` function.
3. We apply a Canny filter on both the template and the margined image (`mat2Edges` function). This adds robustness to the matching process.
4. We run the matching with `matchTemplate` and retrieve the best match location with the `minMaxLoc` function.

The next step consists of calculating the size of the final matrix, in which we will stitch all the images together. This is particularly needed if the input images don't all have the same height.
This step is done inside the `calcFinalImgSize` function. I won't go into too much detail here because, even though it looks a bit complex (to me at least), it is only simple maths (additions, subtractions, multiplications). Take a pen and paper if you want to understand the formulas.
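If you'd rather sanity-check the formulas with concrete numbers, here is a tiny worked example (the image sizes and match location are made up):

```python
# made-up numbers: image 1 is 100x400 (h x w), image 2 is 120x400,
# H_templ_ratio = 0.45, and the best-match location in image 2 is (200, 20)
h1, w1 = 100, 400
h2, w2 = 120, 400
H_templ_ratio = 0.45
loc_x, loc_y = 200, 20

# vertical: the template starts at y=0 in image 1, so image 2's top sits
# loc_y = 20 px above image 1's top; its bottom ends at
# (h2 - loc_y) - h1 = (120 - 20) - 100 = 0 px below image 1's bottom
margin_top = max(0, loc_y)                  # 20
margin_bottom = max(0, (h2 - loc_y) - h1)   # 0
h_final = h1 + margin_top + margin_bottom   # 120

# horizontal: the template is int(w1 * H_templ_ratio) = 180 px wide, so
# image 2 adds w2 - 180 - loc_x = 400 - 180 - 200 = 20 px on the right
w_final = w1 + (w2 - int(w1 * H_templ_ratio) - loc_x)  # 420
print(h_final, w_final)  # 120 420
```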
Once we have the match location for each input image, we only have to do simple maths to copy each input image into the right spot of the final image. Again, I recommend checking the code for implementation details (see the `stitchImages` function).
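Continuing with the same made-up numbers as above, the placement maths of the stitching step works out as follows:

```python
# image 1 is copied at (y_offset, 0) = (20, 0) in the result mat
y_offset = 20  # equals margin_top from the example above

# the template's left edge inside image 1 is at int(w1 * (1 - H_templ_ratio))
x_templ = int(400 * (1 - 0.45))  # 220
y1 = y_offset - 20               # origin_y - loc_y = 0
x1 = 0 + x_templ - 200           # origin_x + x_templ - loc_x = 20

# imgs[1] therefore lands at result[0:120, 20:420], overlapping image 1
# exactly where the template matched
```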
Here is the result with your input images:
As you can see, the result is not "pixel perfect" but it should be good enough for OCR.
And here is another result with input images of different heights:
My program is written in Python and uses the `cv2` (OpenCV) and `numpy` modules. However, it can easily be ported to other languages such as C++, Java, and C#.
import numpy as np
import cv2

def genTemplate(img):
    global H_templ_ratio
    # we get the image's width and height
    h, w = img.shape[:2]
    # we compute the template's bounds: the last H_templ_ratio of the image's width
    x1 = int(float(w) * (1 - H_templ_ratio))
    y1 = 0
    x2 = w
    y2 = h
    return img[y1:y2, x1:x2]  # and crop the input image

def mat2Edges(img):  # applies a Canny filter to get the edges
    edged = cv2.Canny(img, 100, 200)
    return edged

def addBlackMargins(img, top, bottom, left, right):  # top, bottom, left, right: margin widths in pixels
    h, w = img.shape[:2]
    result = np.zeros((h + top + bottom, w + left + right, 3), np.uint8)
    result[top:top+h, left:left+w] = img
    return result

# returns the y_offset of the first image to stitch and the final image size needed
def calcFinalImgSize(imgs, loc):
    global H_templ_ratio
    max_margin_top = 0; max_margin_bottom = 0  # maximum margins needed above and below the first image in order to stitch all the images into one mat
    current_margin_top = 0; current_margin_bottom = 0
    h_init, w_init = imgs[0].shape[:2]
    w_final = w_init
    for i in range(0, len(loc)):
        h, w = imgs[i].shape[:2]
        h2, w2 = imgs[i+1].shape[:2]
        # we compute the max top/bottom margins needed (relative to the first input image) in order to stitch all the images
        current_margin_top += loc[i][1]  # here, we assume that the template's top-left corner Y-coordinate is 0 (relative to its original image)
        current_margin_bottom += (h2 - loc[i][1]) - h
        if current_margin_top > max_margin_top: max_margin_top = current_margin_top
        if current_margin_bottom > max_margin_bottom: max_margin_bottom = current_margin_bottom
        # we compute the width needed for the final result
        w_templ = int(float(w) * H_templ_ratio)  # width of the template, relative to its original image
        w_final += (w2 - w_templ - loc[i][0])  # width needed to stitch all the images into one mat
    h_final = h_init + max_margin_top + max_margin_bottom
    return (max_margin_top, h_final, w_final)

# matches each input image with its following image (1->2, 2->3)
def matchImages(imgs, templates_loc):
    for i in range(0, len(imgs) - 1):
        template = genTemplate(imgs[i])
        template = mat2Edges(template)
        h_templ, w_templ = template.shape[:2]
        # apply template matching
        margin_top = margin_bottom = h_templ; margin_left = margin_right = 0
        # we need to enlarge the input image before calling matchTemplate
        # (the template needs to be strictly smaller than the input image)
        img = addBlackMargins(imgs[i+1], margin_top, margin_bottom, margin_left, margin_right)
        img = mat2Edges(img)
        res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF)  # matching function
        _, _, _, templ_pos = cv2.minMaxLoc(res)  # minMaxLoc gets the best match position
        # as we added margins to the input image, we need to subtract the margin
        # widths to get the template position relative to the initial input image
        # (without the black margins)
        rectified_templ_pos = (templ_pos[0] - margin_left, templ_pos[1] - margin_top)
        templates_loc.append(rectified_templ_pos)
        print("max_loc", rectified_templ_pos)

def stitchImages(imgs, templates_loc):
    # we calculate the "surface" needed to stitch all the images into one mat
    # (and y_offset, the Y offset of the first image to be stitched)
    y_offset, h_final, w_final = calcFinalImgSize(imgs, templates_loc)
    result = np.zeros((h_final, w_final, 3), np.uint8)
    # initial stitch
    h_init, w_init = imgs[0].shape[:2]
    result[y_offset:y_offset+h_init, 0:w_init] = imgs[0]
    origin = (y_offset, 0)  # top-left corner of the last stitched image (y, x)
    # stitching loop
    for j in range(0, len(templates_loc)):
        h, w = imgs[j].shape[:2]
        h2, w2 = imgs[j+1].shape[:2]
        # we compute the coordinates where to stitch imgs[j+1]
        y1 = origin[0] - templates_loc[j][1]
        y2 = origin[0] - templates_loc[j][1] + h2
        x_templ = int(float(w) * (1 - H_templ_ratio))  # x-coordinate of the template's left edge inside its original image
        x1 = origin[1] + x_templ - templates_loc[j][0]
        x2 = origin[1] + x_templ - templates_loc[j][0] + w2
        result[y1:y2, x1:x2] = imgs[j+1]  # we copy the input image into the result mat
        origin = (y1, x1)  # we update the origin point with the last stitched image
    return result

if __name__ == '__main__':
    # input images
    part1 = cv2.imread('part1.jpg')
    part2 = cv2.imread('part2.jpg')
    part3 = cv2.imread('part3.jpg')
    imgs = [part1, part2, part3]
    H_templ_ratio = 0.45  # horizontal ratio of the input that we keep to create a template
    templates_loc = []  # template locations
    matchImages(imgs, templates_loc)
    result = stitchImages(imgs, templates_loc)
    cv2.imshow("result", result)
    cv2.waitKey(0)  # keep the window open until a key is pressed
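A possible refinement (not part of the code above, just a suggestion): switch to the normalized score `cv2.TM_CCOEFF_NORMED`, whose maximum lies in [-1, 1], so weak matches can be detected and flagged. The threshold below is a hypothetical value you would need to tune on your own images:

```python
# drop-in variant for the matching call in matchImages
res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, templ_pos = cv2.minMaxLoc(res)
if max_val < 0.3:  # hypothetical threshold, tune it on your own images
    print("weak match (score %.2f): this stitch may be wrong" % max_val)
```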