python opencv tesseract

Text reading with Tesseract in a noisy image


I have these two images:

[image 1: higher-quality input]

[image 2: lower-quality input]

The first one is clearly of higher quality than the second (although the second is not that bad). I process both images with OpenCV so that Tesseract can read the text, like this:

import cv2
import pytesseract

# scr_crop is the cropped BGR image (defined elsewhere)
img = cv2.cvtColor(scr_crop, cv2.COLOR_BGR2GRAY)
# Fixed threshold: pixels brighter than 220 become 255, all others 0
thresh = cv2.threshold(img, 220, 255, cv2.THRESH_BINARY)[1]

# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# Invert image to use for Tesseract
result = 255 - close

# result = cv2.resize(result, (0, 0), fx=2, fy=2)

text = pytesseract.image_to_string(result, lang="ita")

So I first threshold the grayscale versions of the two images and then apply a closing (a dilation followed by an erosion), obtaining these two results:

[image: processed result for image 1]

[image: processed result for image 2]

As you can see, the result for the first image is great and Tesseract can read the text, while the result for the second image is bad and Tesseract cannot read it. How can I improve the quality of the second image to get a better result from Tesseract?


Solution

  • For the second image, apply only thresholding, with a different threshold type.

    Instead of cv2.THRESH_BINARY, use cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU

    The image becomes:

    [image: second image after inverse Otsu thresholding]

    and if you read it with Tesseract:

    txt = pytesseract.image_to_string(threshold)
    print(txt)
    

    Result will be:

    Esiti Positivi: 57
    
    Esiti Negativi: 1512
    Numerosita: 1569
    
    Tasso di Conversione: 3.63%
    

    Now, what do cv2.THRESH_BINARY_INV and cv2.THRESH_OTSU mean?

    cv2.THRESH_BINARY_INV is the inverse of cv2.THRESH_BINARY: if the pixel value is greater than the threshold, it is set to 0; otherwise it is set to maxval (255 in our case).

    dst(x, y) = 0 if src(x, y) > thresh, maxval otherwise

    source
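
To make the two rules concrete, here is a minimal plain-Python sketch of them. It is not OpenCV's implementation: the function names and the small `patch` of pixel values are made up for illustration, and the threshold of 220 is simply the value used in the question.

```python
def threshold_binary(pixels, thresh, maxval=255):
    """THRESH_BINARY rule: maxval where pixel > thresh, otherwise 0."""
    return [[maxval if p > thresh else 0 for p in row] for row in pixels]


def threshold_binary_inv(pixels, thresh, maxval=255):
    """THRESH_BINARY_INV rule: 0 where pixel > thresh, otherwise maxval."""
    return [[0 if p > thresh else maxval for p in row] for row in pixels]


# Hypothetical 2x3 grayscale patch; values chosen only for illustration.
patch = [[10, 220, 221],
         [255, 0, 130]]

binary = threshold_binary(patch, 220)
inverse = threshold_binary_inv(patch, 220)

print(binary)
print(inverse)
```

Every pixel that is white in one output is black in the other, which is why the inverse variant removes the need for a separate `255 - image` step before OCR.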

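    cv2.THRESH_OTSU finds the optimal threshold value using Otsu's algorithm. [page 3] When this flag is set, the threshold value passed to cv2.threshold is ignored and the computed one is used instead.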
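
As a rough illustration of what "optimal" means here, the sketch below picks the threshold that maximizes the between-class variance of the two pixel groups, which is the criterion Otsu's method uses. It is a plain-Python sketch, not OpenCV's code: `otsu_threshold` is a hypothetical helper, it assumes a flat list of integer intensities in 0..255, and the sample `pixels` are invented. OpenCV may report a different value when several thresholds give the same split.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Pixels <= t form the lower class, pixels > t the upper class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Invented bimodal data: a dark group (20, 30) and a bright group (200, 210).
pixels = [20] * 50 + [30] * 50 + [200] * 40 + [210] * 60

t = otsu_threshold(pixels)
dark = [p for p in pixels if p <= t]
print(t, len(dark), len(pixels) - len(dark))
```

The threshold lands in the gap between the two groups, so no hand-tuned value such as 220 is needed; this is what makes the approach more forgiving on the lower-quality image.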

    Code:

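    import cv2
    import pytesseract

    # Load the image and convert it to grayscale
    img = cv2.imread("c7xq9.png")
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Inverse binary threshold; with THRESH_OTSU the threshold argument (0 here) is ignored
    thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    txt = pytesseract.image_to_string(thr)
    print(txt)
    # Display the thresholded image until a key is pressed
    cv2.imshow("thr", thr)
    cv2.waitKey(0)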