python-3.x numpy python-imaging-library python-tesseract

I need to make pytesseract.image_to_string faster


I'm capturing a region of the screen and using Tesseract to read the text in it into a string. The problem is that it's too slow for what I need: I'm getting about 5.6 fps and need more like 10-20. (I left out the imports because they're apparent from the code.)

I've tried everything I know and nothing has helped.

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

time.sleep(7)

def getDesiredWindow():
    """Returns the top-left and bottom-right of the desired window."""
    print('Click the top left of the desired region.')
    pt1 = detectClick()
    print('First position set!')
    time.sleep(1)
    print('Click the bottom right of the desired region.')
    pt2 = detectClick()
    print('Got the window!')
    return pt1,pt2

def detectClick():
    """Detects and returns the click position"""
    state_left = win32api.GetKeyState(0x01)
    print("Waiting for click...")
    while True:
        a = win32api.GetKeyState(0x01)
        if a != state_left: #button state changed
            state_left = a
            if a < 0:
                print('Detected left click')
                return win32gui.GetCursorPos()


def gettext(pt1,pt2):
    # From the two input points, define the desired box
    box = (pt1[0],pt1[1],pt2[0],pt2[1])
    image = ImageGrab.grab(box)
    # This is the part that needs to be faster
    return pytesseract.image_to_string(image)

Solution

  • My solution was to make the image smaller.

    This can make the image_to_string result less accurate, but in my case the images were 1500 pixels wide and downscaling gave me about a 3x speedup. Try adjusting basewidth and test again:

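    from PIL import Image

    basewidth = 600  # target width in pixels; tune for speed vs. accuracy
    img = Image.open('yourimage.png')
    # Scale the height by the same factor to keep the aspect ratio
    wpercent = basewidth / img.size[0]
    hsize = int(img.size[1] * wpercent)
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    img = img.resize((basewidth, hsize), Image.LANCZOS)
    img.save('yourimage.png')  # overwrites the original file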
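
Since the question grabs the screen directly, the same downscaling can probably be done in memory without saving a file. Here is a rough sketch of that idea. The names scaled_size and grab_text are made up for illustration, and the default width of 600 is just the value from the snippet above, so treat it as a starting point to tune rather than a recommendation.

```python
def scaled_size(size, base_width):
    """Return (width, height) scaled down to base_width, keeping the aspect ratio.

    Images that are already at most base_width wide are left unchanged,
    since upscaling would only add work for the OCR step.
    """
    width, height = size
    if width <= base_width:
        return width, height
    # Integer arithmetic truncates like int() in the snippet above
    return base_width, max(1, height * base_width // width)


def grab_text(box, base_width=600):
    """Grab a screen region, downscale it in memory, and run OCR on it.

    Hypothetical helper: assumes Pillow and pytesseract are installed and
    that tesseract_cmd is configured as in the question.
    """
    # Imported here so scaled_size can be used without a display or Tesseract
    from PIL import Image, ImageGrab
    import pytesseract

    image = ImageGrab.grab(box)
    new_size = scaled_size(image.size, base_width)
    if new_size != image.size:
        image = image.resize(new_size, Image.LANCZOS)
    return pytesseract.image_to_string(image)
```

With the question's code, this would be called as grab_text((pt1[0], pt1[1], pt2[0], pt2[1])). Whether it reaches 10-20 fps depends on the region size and how small the text can get before recognition breaks down, so it is worth timing a few values of base_width.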