python opencv image-processing computer-vision mser

How to remove overlapping contours and separate each character as an individual contour for character extraction?

I am trying to implement character extraction from images in Python using MSER in OpenCV. This is my code so far:

import cv2
import numpy as np

# create MSER object
mser = cv2.MSER_create()
# load the input image (placeholder path) and convert it to grayscale
img = cv2.imread('image.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# detect the regions
regions,_ = mser.detectRegions(gray)
# find convex hulls of the regions
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
# initialize threshold area of the contours
ThresholdContourArea = 10000
# initialize empty list for the characters and their locations
char = []
loc = []
# get the character part of the image and its location if the contour area is below the threshold
for contour in hulls:
    if cv2.contourArea(contour) > ThresholdContourArea:
        continue
    # get the bounding rectangle around the contour
    bound_rect = cv2.boundingRect(contour)
    loc.append(bound_rect)
    det_char = gray[bound_rect[1]:bound_rect[1]+bound_rect[3],bound_rect[0]:bound_rect[0]+bound_rect[2]]
    char.append(det_char)

But this method gives multiple contours for the same letter, and in some places several letters are merged into one contour. Here is an example. Original image:

After adding the contours:

Here the first T has multiple contours around it, and the two r's are combined into one contour. How do I prevent that?

Solution

Instead of using MSER, here's a simple approach using thresholding + contour filtering. We first remove the border, then apply Otsu's threshold to obtain a binary image. The idea is that each letter should be an individual contour. We then find the contours and draw a bounding rectangle around each one.

Removed border -> binary image -> result

Note: In some cases the letters are connected. To separate the merged characters, we can first enlarge the image using imutils.resize() and then perform erosion or morphological opening. However, I was unable to obtain great results, since the text would disappear even with the smallest kernel.

Code

import cv2
import imutils

# Load image and resize
image = cv2.imread('1.png')
image = imutils.resize(image, width=500)

# Remove border
kernel_vertical = cv2.getStructuringElement(cv2.MORPH_RECT, (1,50))
temp1 = 255 - cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel_vertical)
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (50,1))
temp2 = 255 - cv2.morphologyEx(image, cv2.MORPH_CLOSE, horizontal_kernel)
temp3 = cv2.add(temp1, temp2)
result = cv2.add(temp3, image)

# Convert to grayscale and Otsu's threshold
gray = cv2.cvtColor(result, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours and draw a bounding rectangle around each one
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(result, (x, y), (x + w, y + h), (36,255,12), 2)

cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()
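
The loop above draws a rectangle for every contour it finds. If noise specks, a leftover border, or boxes nested inside other boxes show up, the bounding boxes could be filtered before cropping. The following is a minimal sketch, not part of the original answer: the function names and the min_area / max_area defaults are hypothetical and would need tuning for a given image. It works on plain (x, y, w, h) tuples such as those returned by cv2.boundingRect, and the left-to-right sort assumes a single line of text.

```python
def contains(outer, inner):
    """Return True if box `inner` lies entirely within box `outer` (x, y, w, h)."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

def filter_boxes(boxes, min_area=50, max_area=10000):
    """Keep plausibly character-sized boxes, drop nested ones, sort left to right."""
    # Remove exact duplicates, then apply the (hypothetical) area limits
    unique = set(map(tuple, boxes))
    sized = [b for b in unique if min_area <= b[2] * b[3] <= max_area]
    # Drop any box that is fully enclosed by a different box
    kept = [b for b in sized if not any(o != b and contains(o, b) for o in sized)]
    return sorted(kept, key=lambda b: b[0])

# Example boxes: two letters, one nested duplicate, one page-sized box, one speck
boxes = [(10, 10, 20, 30), (12, 14, 10, 12), (40, 10, 18, 30), (0, 0, 300, 200), (70, 12, 2, 2)]
print(filter_boxes(boxes))
```

With the code above, the input could be built as boxes = [cv2.boundingRect(c) for c in cnts], and each kept box cropped from the grayscale image with gray[y:y+h, x:x+w], as in the question.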