python, python-tesseract

Getting text from low-quality gif file with tesseract


from io import BytesIO

import pytesseract
import requests
from PIL import Image, ImageOps

image_url = 'gif'

def optimize_and_ocr_from_url(image_url, tesseract_config="--psm 6"):
    # Download the image from the URL
    response = requests.get(image_url)
    image = Image.open(BytesIO(response.content))

    # Optimize the image (resize, convert to grayscale, etc.)
    optimized_image = optimize_image(image)

    # Use Tesseract OCR to extract text from the optimized image
    text = pytesseract.image_to_string(optimized_image, config=tesseract_config)

    # Print the OCR result to the console
    print("OCR Result:")
    print(text)

def optimize_image(image, target_resolution=(109, 50)):
    # Resize the image to the target resolution
    resized_image = image.resize(target_resolution)

    # Convert the image to grayscale
    grayscale_image = ImageOps.grayscale(resized_image)

    return grayscale_image

# Example usage
optimize_and_ocr_from_url(image_url)

I am trying to convert this type of image to text.

I tried converting the image to grayscale and playing with tesseract_config, but nothing seems to work: the response is just empty. I also tried other solutions on Stack Overflow but couldn't get them to work. I am on Windows.


Solution

  • I tried your GIF directly with tesseract (instead of pytesseract) in the console (on Linux):

    $ tesseract input.gif stdout
    
    Warning: Invalid resolution 0 dpi. Using 70 instead.
    Estimating resolution as 270
    Empty page!!
    Estimating resolution as 270
    Empty page!!
    

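    Next I tried converting it with ImageMagick, since the GIF seems to have a few problems (the options below resize it, set its resolution, and flatten it onto a white background):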

    $ convert input.gif -resize 200x24 -units pixelsperinch -density 72 -background white -flatten output.png
    

    output.png


    Now it recognizes text, but there is another problem:
    there is no top margin, so it has trouble recognizing the 7 and detects / instead:

    +3/0 3/7 312208
    

    So I used -gravity south -extent 200x28 to add a top margin:

    $ convert input.gif -resize 200x24 -units pixelsperinch -density 72 -background white -flatten -gravity south -extent 200x28 output-with-margin.png
    

    and now tesseract detects the number correctly.

    output-with-margin.png (Stack Overflow has a white background, so you can't see the new margin)



    You can try to do the same with Pillow, use subprocess.run() to run ImageMagick directly, or try the Python module Wand, which wraps ImageMagick.
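
As a rough illustration of the Pillow route, here is a minimal sketch that tries to mirror the ImageMagick options used above: fit the image into 200x24 while keeping the aspect ratio, flatten any transparency onto white, and paste the result at the bottom of a 200x28 white canvas. The function name preprocess_for_ocr and its parameters are made up for this example, and it has not been checked against the original GIF, so the sizes may need tuning.

```python
from PIL import Image


def preprocess_for_ocr(image, box=(200, 24), canvas_size=(200, 28)):
    """Hypothetical helper: resize, flatten onto white, add a top margin."""
    # An explicit alpha channel makes palette/transparent GIF frames easy to flatten.
    rgba = image.convert("RGBA")

    # Fit inside `box` while keeping the aspect ratio (like -resize 200x24).
    scale = min(box[0] / rgba.width, box[1] / rgba.height)
    new_size = (max(1, round(rgba.width * scale)),
                max(1, round(rgba.height * scale)))
    rgba = rgba.resize(new_size)

    # Composite onto white (like -background white -flatten), then grayscale.
    white = Image.new("RGBA", new_size, (255, 255, 255, 255))
    flat = Image.alpha_composite(white, rgba).convert("L")

    # Bottom-align on a larger white canvas (like -gravity south -extent 200x28).
    canvas = Image.new("L", canvas_size, 255)
    left = (canvas_size[0] - flat.width) // 2
    top = canvas_size[1] - flat.height
    canvas.paste(flat, (left, top))
    return canvas


# Possible usage (file names are placeholders):
# result = preprocess_for_ocr(Image.open("input.gif"))
# result.save("output-with-margin.png", dpi=(72, 72))
```

The returned image can be saved and passed to tesseract on the command line, or handed to pytesseract.image_to_string() as in the question.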


    EDIT:

    It doesn't need the top margin if I restrict the character whitelist to digits, + and space:

    $ tesseract output.png stdout -c tessedit_char_whitelist='+0123456789 '
    

    EDIT:

    I checked all the options in tesseract and found --dpi. It can read the GIF with --dpi 72 but not with --dpi 70 (maybe 70 will work if you resize even more).
    But the image still has to be resized to 200x24, and it may need a white background to recognize the 7.
    I didn't test the combination with a top margin and without a white background.

    $ tesseract output.gif stdout --dpi 72 -c tessedit_char_whitelist='+0123456789 '
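
If you want to run that same command from Python, one option is subprocess.run() with the arguments passed as a list, which avoids shell quoting problems with the trailing space in the whitelist. This is only a sketch: the helper names build_tesseract_command and run_tesseract are invented here, and it assumes the tesseract executable is on PATH.

```python
import subprocess


def build_tesseract_command(image_path, dpi=72, whitelist="+0123456789 "):
    """Hypothetical helper: build the argument list for the tesseract CLI."""
    return [
        "tesseract", str(image_path), "stdout",
        "--dpi", str(dpi),
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def run_tesseract(image_path, **kwargs):
    """Run tesseract and return the recognized text (needs tesseract on PATH)."""
    result = subprocess.run(
        build_tesseract_command(image_path, **kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract exits with an error
    )
    return result.stdout.strip()


# Possible usage (file name is a placeholder):
# print(run_tesseract("output.gif"))
```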
    

    EDIT:

    The Tesseract documentation covers this topic in "Improving the quality of the output" (tessdoc).