I have these two images:
The first one is clearly of higher quality than the second (although the second is not that bad). I process both images with OpenCV so that Tesseract can read the text, like this:
import pytesseract
import cv2
img = cv2.cvtColor(scr_crop, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(img, 220, 255, cv2.THRESH_BINARY)[1]
# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# Invert image to use for Tesseract
result = 255 - close
# result = cv2.resize(result, (0, 0), fx=2, fy=2)
text = pytesseract.image_to_string(result, lang="ita")
So, after converting both images to grayscale and thresholding them, I perform a dilation followed by an erosion, obtaining these two results:
As you can see, for the first image I get a great result and Tesseract is able to read the text, while for the second image the result is bad and Tesseract cannot read the text. How can I improve the quality of the second image to get a better result from Tesseract?
For the second image, apply only thresholding, but with a different threshold type. Instead of cv2.THRESH_BINARY, use cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU.
The image will become:
and if you read it:
txt = pytesseract.image_to_string(thr)
print(txt)
The result will be:
Esiti Positivi: 57
Esiti Negativi: 1512
Numerosita: 1569
Tasso di Conversione: 3.63%
Now, what do cv2.THRESH_BINARY_INV and cv2.THRESH_OTSU mean?
cv2.THRESH_BINARY_INV is the opposite operation of cv2.THRESH_BINARY: if the current pixel value is greater than the threshold, it is set to 0; otherwise it is set to maxval (255 in our case).
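cv2.THRESH_OTSU finds the optimal threshold value using Otsu's algorithm. [page 3] When this flag is set, the threshold value passed to cv2.threshold is ignored and the computed one is used instead.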
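To make the two flags more concrete, here is a minimal pure-Python sketch of what they do. It is only an illustration, not OpenCV's actual implementation: the function names `threshold_binary_inv` and `otsu_threshold` and the tiny 1-D "image" are made up for this example, and the sketch assumes 8-bit grayscale values (0 to 255).

```python
def threshold_binary_inv(pixels, thresh, maxval=255):
    """Inverse binary threshold: values above `thresh` become 0, the rest `maxval`."""
    return [0 if p > thresh else maxval for p in pixels]


def otsu_threshold(pixels):
    """Pick the threshold that maximizes the between-class variance (Otsu's idea)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, nothing to separate
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical 1-D "image": a few dark pixels (~30) mixed with light ones (~200)
pixels = [30, 35, 28, 200, 205, 198, 202, 33, 199, 201]

t = otsu_threshold(pixels)               # threshold chosen automatically
binary = threshold_binary_inv(pixels, t)  # dark pixels -> 255, light pixels -> 0
print(t, binary)
```

The point of the sketch is that no hand-picked value such as 220 is needed: the threshold is derived from the histogram of the image itself, which is why the same call can work on images with different contrast.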
Code:
import cv2
import pytesseract
img = cv2.imread("c7xq9.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# With THRESH_OTSU the threshold argument (220) is ignored; Otsu's method picks it
thr = cv2.threshold(gry, 220, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)[1]
txt = pytesseract.image_to_string(thr)
print(txt)
cv2.imshow("thr", thr)
cv2.waitKey(0)