python opencv roi

How to find the same ROI for the same text with different colors?


I am trying to find ROIs on these two images:

[image #1]

[image #2]

I'm using this code for image #1:

image_1 = image1
corr1 = []
gray = cv2.cvtColor(image_1, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (1,1), 1)
thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,11,10)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
dilate = cv2.dilate(thresh, kernel, iterations=3)

cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
ROI_numbers1 = 0
ROI1 = []
for c in cnts:
    area = cv2.contourArea(c)
    if area > 5:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(image_1, (x, y), (x + w, y + h), (0,255,0), 1)
        ROI1.append(image_1[y:y+h, x:x+w])
        corr1.append([y,y+h, x,x+w])
        ROI_numbers1 += 1

And this code for image #2:

image_2 = image2
corr2 = []
gray = cv2.cvtColor(image_2, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (1,1), 1)
thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,11,10)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
dilate = cv2.dilate(thresh, kernel, iterations=3)

cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
ROI_numbers2 = 0
ROI2 = []
for c in cnts:
    area = cv2.contourArea(c)
    if area > 5:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(image_2, (x, y), (x + w, y + h), (0,255,0), 1)
        ROI2.append(image_2[y:y+h, x:x+w])
        corr2.append([y,y+h, x,x+w])
        ROI_numbers2 += 1

When I display the ROIs with OpenCV, I get this:

[image: displayed ROIs]

Why is the ROI area for the blue text in image #1 smaller than for the white text in image #2?


Solution

  • When you convert your images to grayscale, the white and the blue text get different gray values. As a result, cv2.GaussianBlur gives different results, and so does the subsequent cv2.adaptiveThreshold. In the end, the detected contours differ, and so do the ROIs.
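
As a rough illustration of that first point, the sketch below computes the gray value of single pixels with the luma weights OpenCV documents for color-to-gray conversion (0.299 R + 0.587 G + 0.114 B). The helper name bgr_to_gray and the pure-blue text color are assumptions made for illustration; the actual blue in the image may differ.

```python
def bgr_to_gray(b, g, r):
    """Approximate gray value (0..255) of one BGR pixel.

    Integer form of Y = 0.299*R + 0.587*G + 0.114*B, rounded to nearest.
    """
    return (114 * b + 587 * g + 299 * r + 500) // 1000


BACKGROUND = (53, 53, 53)     # background gray named in the answer
WHITE_TEXT = (255, 255, 255)
BLUE_TEXT = (255, 0, 0)       # assumption: pure blue, BGR order

background_gray = bgr_to_gray(*BACKGROUND)
for name, bgr in [("background", BACKGROUND),
                  ("white text", WHITE_TEXT),
                  ("blue text", BLUE_TEXT)]:
    gray = bgr_to_gray(*bgr)
    print(f"{name}: gray={gray}, difference to background={gray - background_gray}")
# background: gray=53, difference to background=0
# white text: gray=255, difference to background=202
# blue text: gray=29, difference to background=-24
```

Under these assumed colors, the white text ends up far brighter than the background while the blue text ends up slightly darker, so the adaptive threshold sees a very different contrast in the two images.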

    Don't convert to grayscale here! In your original, three-channel image, mask everything that's not the background, which is a solid gray (53, 53, 53). That mask replaces your thresh. From there, you can reuse your existing implementation.

    Here's a minimal example to check whether the resulting bounding rectangles (ROIs) are the same:

    import cv2
    import numpy as np
    
    
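    def cnts_from_image(image):
        # Mask every pixel that differs from the solid gray background (53, 53, 53)
        thresh = (~np.all(image == (53, 53, 53), axis=2)).astype(np.uint8) * 255
        # Dilate to merge nearby foreground pixels into connected regions
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
        dilate = cv2.dilate(thresh, kernel, iterations=3)
        # findContours returns 2 values in OpenCV 2.x/4.x and 3 values in 3.x
        cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        cnts = cnts[0] if len(cnts) == 2 else cnts[1]
        return cnts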
    
    
    rects_white = [cv2.boundingRect(c) for c in cnts_from_image(cv2.imread('white_text.png'))]
    rects_blue = [cv2.boundingRect(c) for c in cnts_from_image(cv2.imread('blue_text.png'))]
    
    # Compare the full lists, so a differing number of rectangles also counts as a mismatch
    print('All rectangles identical:', rects_white == rects_blue)
    # All rectangles identical: True
    
    ----------------------------------------
    System information
    ----------------------------------------
    Platform:      Windows-10-10.0.16299-SP0
    Python:        3.9.1
    NumPy:         1.20.2
    OpenCV:        4.5.1
    ----------------------------------------
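
The exact comparison against (53, 53, 53) relies on the background being perfectly uniform. If that does not hold, for example after lossy compression, a tolerance might help. The following is only a sketch under that assumption: the helper foreground_mask and its tol parameter are hypothetical and not part of the original solution, and the tiny test image is synthetic.

```python
import numpy as np


def foreground_mask(image, background=(53, 53, 53), tol=0):
    """Return a uint8 mask (255 = not background) for a BGR image.

    A pixel counts as background when every channel lies within `tol`
    of the background color; tol=0 reproduces the exact comparison.
    """
    # Use a signed type so the subtraction cannot wrap around
    diff = np.abs(image.astype(np.int16) - np.array(background, dtype=np.int16))
    is_background = np.all(diff <= tol, axis=2)
    return (~is_background).astype(np.uint8) * 255


# Synthetic 1x4 image: exact background, slightly noisy background,
# white text, blue text (BGR order)
img = np.array([[[53, 53, 53], [55, 52, 54], [255, 255, 255], [255, 0, 0]]],
               dtype=np.uint8)

print(foreground_mask(img).tolist())          # [[0, 255, 255, 255]]
print(foreground_mask(img, tol=3).tolist())   # [[0, 0, 255, 255]]
```

The returned mask could take the place of thresh in cnts_from_image. Note that a larger tolerance may also swallow anti-aliased text edges that are close to the background color, so the value would need tuning per image.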