Tags: python, opencv, computer-vision, bounding-box, mser

Improve text area detection (OpenCV, Python)


I am working on a project that requires me to detect text areas in an image. This is the result I have achieved so far using the code below.

Original image: [image]

Result: [image]

The code is the following:

import cv2
import numpy as np

# read and scale down image
img = cv2.pyrDown(cv2.imread('C:\\Users\\Work\\Desktop\\test.png', cv2.IMREAD_UNCHANGED))

# threshold image
ret, threshed_img = cv2.threshold(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY),
                                  127, 255, cv2.THRESH_BINARY)
# find all contours (RETR_TREE also returns the nested ones)
# [-2:] keeps (contours, hierarchy) on both OpenCV 3.x and 4.x
contours, hier = cv2.findContours(threshed_img, cv2.RETR_TREE,
                                  cv2.CHAIN_APPROX_SIMPLE)[-2:]

# with each contour, draw boundingRect in green
# a minAreaRect in red and
# a minEnclosingCircle in blue
for c in contours:
    # get the bounding rect
    x, y, w, h = cv2.boundingRect(c)
    # draw a green rectangle to visualize the bounding rect
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), thickness=1, lineType=8, shift=0)

    # get the min area rect
    #rect = cv2.minAreaRect(c)
    #box = cv2.boxPoints(rect)
    # convert all floating-point coordinates to int
    #box = box.astype(int)
    # draw a red rotated rectangle
    #cv2.drawContours(img, [box], 0, (0, 0, 255))

    # finally, get the min enclosing circle
    #(x, y), radius = cv2.minEnclosingCircle(c)
    # convert all values to int
    #center = (int(x), int(y))
    #radius = int(radius)
    # and draw the circle in blue
    #img = cv2.circle(img, center, radius, (255, 0, 0), 2)

print(len(contours))
cv2.drawContours(img, contours, -1, (255, 255, 0), 1)

cv2.namedWindow('contours', 0)
cv2.imshow('contours', img)
while(cv2.waitKey()!=ord('q')):
    continue
cv2.destroyAllWindows()

As you can see, the script can do more than I need. See the commented-out parts if you need more.

What I actually need is to enclose each text area in a single rectangle, rather than (nearly) every character, which is what the script finds now. In other words, I want to filter out the boxes around single digits or letters and wrap each whole sequence in one box.

For example, the first sequence in a box, the second in another one and so on.

I searched a bit and I found something about "filter rectangle area". I don't know if it is useful for my purpose.

I also took a look at some of the first results on Google, but most of them don't work very well. I guess the code needs to be tweaked a bit, but I am a newbie in the OpenCV world.


Solution

  • Solved using the following code.

    import cv2
    
    # Load the image
    img = cv2.imread('image.png')
    
    # convert to grayscale
    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    
    # smooth the image to avoid noises
    gray = cv2.medianBlur(gray,5)
    
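    # Apply adaptive threshold (Gaussian-weighted, inverted so dark pixels become white)
    thresh = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY_INV,11,2)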
    thresh_color = cv2.cvtColor(thresh,cv2.COLOR_GRAY2BGR)
    
    # apply dilation and erosion to join the gaps - change the iteration count to detect larger or smaller areas
    thresh = cv2.dilate(thresh,None,iterations = 15)
    thresh = cv2.erode(thresh,None,iterations = 15)
    
    # Find the contours ([-2:] keeps the unpacking valid on both OpenCV 3.x and 4.x)
    contours,hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
    
    # For each contour, find the bounding rectangle and draw it
    for cnt in contours:
        x,y,w,h = cv2.boundingRect(cnt)
        cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
        cv2.rectangle(thresh_color,(x,y),(x+w,y+h),(0,255,0),2)
    
    # Finally show the image
    cv2.imshow('img',img)
    cv2.imshow('res',thresh_color)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

    The parameter to adjust to obtain the result below is the number of iterations in the dilate and erode calls. Lower values will create more bounding rectangles, around (nearly) every digit/character.

    Result: [image]