I am trying to develop a meter reading detection system. This is the picture.
I need to get the meter reading 27599 as the output. I used this code:
import pytesseract
import cv2
image = cv2.imread('read2.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
(H, W) = gray.shape
rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20, 7))
gray = cv2.GaussianBlur(gray, (1, 3), 0)
blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, rectKernel)
res = cv2.threshold(blackhat, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
pytesseract.image_to_string(res, config='--psm 12 --oem 3 digits')
I get this output:
'.\n\n-\n\n3\n\n7\n\n7\n\n3\n\n-2105 566.261586\n\n161200\n\n310010\n\n--\n\n.-\n\n.\n\n5\n\x0c'
This is my first OCR project. Any help will be appreciated.
There is a lot of text in the image that can be removed before reading the actual meter number. We can also restrict the OCR to digits only to prevent false positives, since some 7-segment digits resemble letters.
Tesseract does not work well on 7-segment digits, so I will use EasyOCR instead.
The procedure is: crop the photo to the meter, find the large circle on the meter face with a Hough circle transform, keep the upper half of that circle (where the digits are), and run EasyOCR on it with only digits allowed:
import cv2 as cv
import numpy as np
import easyocr

# Load the photo (file name taken from the question) and crop to the area around the meter
orig_image = cv.imread('read2.jpg')
cropped = orig_image[300:850, 200:680]
cropped_height, cropped_width, _ = cropped.shape
gray = cv.cvtColor(cropped, cv.COLOR_BGR2GRAY)
blurred = cv.GaussianBlur(gray, (17,17),0)
minDist = 100
param1 = 30
param2 = 50
minRadius = 100
maxRadius = 300
circle_img = cropped.copy()
circles = cv.HoughCircles(blurred, cv.HOUGH_GRADIENT, 1, minDist, param1=param1, param2=param2, minRadius=minRadius, maxRadius=maxRadius)
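if circles is None:
    raise SystemExit("no circles detected")
# HoughCircles returns shape (1, N, 3); round to plain Python ints,
# because uint16 values would wrap around in the subtraction below
circles = np.around(circles[0]).astype(int).tolist()
print(f"{len(circles)} circles detected", circles[0])
for x, y, r in circles:
    cv.circle(circle_img, (x, y), r, (0, 255, 0), 2)
x, y, r = circles[0]  # use the first detected circle
circle_center = (x, y)
circle_radius = r
# Keep the upper half of the circle, clamped to the top edge of the image
color_cropped = cropped[max(circle_center[1] - circle_radius, 0) : circle_center[1], :]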
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(color_cropped, allowlist='0123456789')
if result:
    print("detected number: ", result[0][1])
Output:
detected number: 27599
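Note that result[0] is just the first text region EasyOCR returns, so it may not be the meter reading if other digits survive the crop. readtext returns a list of (bounding box, text, confidence) entries, so one option is to pick the most confident digit-only candidate of the expected length. A minimal sketch, assuming a five-digit reading as in this photo; pick_reading, expected_len and min_conf are names made up for illustration, and the sample list is invented data in the readtext result format, not real output:

```python
def pick_reading(results, expected_len=5, min_conf=0.3):
    """Return the most confident digit-only text of the expected length, or None.

    `results` uses the shape returned by easyocr's readtext():
    a list of (bounding_box, text, confidence) entries.
    """
    candidates = [
        (conf, text)
        for _bbox, text, conf in results
        if text.isdigit() and len(text) == expected_len and conf >= min_conf
    ]
    if not candidates:
        return None
    # max() compares the confidence first, so the best-scoring text wins
    return max(candidates)[1]


# Made-up sample in readtext() format, for illustration only
sample = [
    ([[10, 5], [60, 5], [60, 25], [10, 25]], "310", 0.41),
    ([[80, 40], [300, 40], [300, 110], [80, 110]], "27599", 0.93),
    ([[90, 130], [200, 130], [200, 150], [90, 150]], "16120", 0.22),
]
print(pick_reading(sample))
```

In the code above you would call pick_reading(result) instead of result[0][1]. The length and confidence thresholds are guesses and would need tuning for a different meter.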