I'm trying to build an OCR model to scan A-level questions. I was testing it out on this image:
However, there seems to be a bug in the code or a problem with my laptop that enlarges the image automatically, essentially zooming in on an irrelevant region where the text is not visible enough for the OCR to pick up.
Image after using the imread function
This is the code that I've used:
import cv2
import pytesseract as pyt
import easyocr
#read image
image = cv2.imread("test image 3.jpg")
#pytesseract executable
pyt.pytesseract.tesseract_cmd = "C:\\Users\\kobe4\\OneDrive\\Desktop\\AI\\python stuff\\tesseract\\tesseract.exe"
#preprocessing
#1.make the image gray
im_gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
im_rt = cv2.rotate(im_gray,cv2.ROTATE_90_COUNTERCLOCKWISE)
cv2.imshow('first image',im_rt)
cv2.waitKey(0)
Even just calling imshow on the unmodified image returned by imread causes the unexpected zoom. I don't know what is causing this and am hoping someone can guide me through it.
I tried to resize the window with OpenCV's resizeWindow function. I expected the entire image to be scaled down to the size I set, but it just cropped out a portion of the already zoomed-in image and made the window smaller. The same thing happened when I tried to reduce the image size.
Image after grayscaled and resize
This was the image after I tried resizing the window and the image independently, in 2 separate attempts (and grayscaled it).
I'm assuming that your original image is much larger than your screen resolution. By default, imshow creates its window with the WINDOW_AUTOSIZE flag, which tries to display the image at its original size. Since the window cannot be larger than the display, only part of the image fits and it appears cropped. To get around that, create the window with the WINDOW_NORMAL flag before calling imshow.
As for resizing the window, resizeWindow does exactly that: it changes the size of the window, not the image. Try the code below and see if that fixes your issue.
import cv2
image = cv2.imread("test image 3.jpg")
im_gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
im_rt = cv2.rotate(im_gray,cv2.ROTATE_90_COUNTERCLOCKWISE)
# WINDOW_NORMAL lets the window be resized; the image is scaled to fit it
cv2.namedWindow('first_image', cv2.WINDOW_NORMAL)
# shape is (rows, cols), so shape[1] is the width and shape[0] the height
cv2.resizeWindow('first_image', im_rt.shape[1]//4, im_rt.shape[0]//4)
cv2.imshow('first_image', im_rt)
cv2.waitKey(0)
cv2.destroyAllWindows()
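If you would rather shrink the image itself than the window, here is a rough sketch using cv2.resize. The helper names fit_within and show_scaled and the 1280x720 limit are my own placeholders, not anything from your code, so adjust them to your display.

```python
def fit_within(width, height, max_width=1280, max_height=720):
    """Return (new_width, new_height) that fits inside the box, keeping aspect ratio.

    max_width and max_height are assumed values; use your own screen size.
    """
    # Capping the scale at 1.0 means small images are never enlarged
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def show_scaled(image, window_name="preview", max_width=1280, max_height=720):
    """Downscale the image data itself, then display it (hypothetical helper)."""
    import cv2  # imported here so fit_within works without OpenCV installed

    height, width = image.shape[:2]  # shape is (rows, cols[, channels])
    new_size = fit_within(width, height, max_width, max_height)
    # cv2.resize expects (width, height); INTER_AREA suits downscaling
    resized = cv2.resize(image, new_size, interpolation=cv2.INTER_AREA)
    cv2.imshow(window_name, resized)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
```

Calling show_scaled(im_rt) in place of the imshow lines should display the whole page. For the OCR step itself you probably still want to pass the full-resolution image, since downscaling can make small text harder to read.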