python azure ocr azure-cognitive-services

I don't get the expected result when calling the OCR API from Computer Vision


I'm trying to use the ocr method from Computer Vision to extract all the text from a specific image. However, it doesn't return the text that I know is there: when I analyze the same image directly with the demo available on this page, https://azure.microsoft.com/es-es/services/cognitive-services/computer-vision/, it does return the data.

This is the image I'm trying to get the data from: https://bitbucket.org/miguel_acevedo_ve/python-stream/raw/086279ad6885a490e521785ba288914ed98cfd1d/test.jpg

I have followed the Python tutorial available on the Azure documentation site.

import requests  # needed for requests.post below
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from PIL import Image
from io import BytesIO

subscription_key = "<Subscription Key>"

assert subscription_key

vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"

ocr_url = vision_base_url + "ocr"

image_url = "https://bitbucket.org/miguel_acevedo_ve/python-stream/raw/086279ad6885a490e521785ba288914ed98cfd1d/test.jpg"

'''image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/" + \
    "Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png"
'''

headers = {'Ocp-Apim-Subscription-Key': subscription_key}
params  = {'mode' : 'Printed'}
data    = {'url': image_url}
response = requests.post(ocr_url, headers=headers, params=params, json=data)
response.raise_for_status()

analysis = response.json()
print(analysis)

This is my current output:

{u'regions': [], u'textAngle': 0.0, u'orientation': u'NotDetected', u'language': u'unk'}
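To confirm that the response really carries no text, the nested structure can be flattened into plain strings. The sketch below assumes the usual ocr response layout of regions containing lines containing words; `ocr_text` is a hypothetical helper name, not part of any Azure SDK.

```python
def ocr_text(analysis):
    """Return the recognized text of an ocr response, one string per line."""
    lines = []
    # Assumed layout: regions -> lines -> words, each word carrying a "text" field.
    for region in analysis.get("regions", []):
        for line in region.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return lines


# The response shown above has an empty "regions" list, so nothing is extracted.
empty = {"regions": [], "textAngle": 0.0,
         "orientation": "NotDetected", "language": "unk"}
print(ocr_text(empty))
```

With the output above this yields an empty list, which matches the problem: the service detected no text regions at all.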

UPDATE: The solution is to use recognizeText, not the ocr function, from Computer Vision.


Solution

  • The solution is to use the recognizeText method rather than the ocr method from Computer Vision.

    First you send a POST request, and then you use the operation ID (returned in the Operation-Location response header) to make a GET request that obtains the results.

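    vision_base_url = "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/"

    ocr_url = vision_base_url + "recognizeText"

    # Assumption: imgByteArr holds the raw image bytes, so the body is sent as octet-stream
    headers['Content-Type'] = 'application/octet-stream'

    # POST starts the asynchronous operation; the result URL is returned in a header
    response = requests.post(
        ocr_url, headers=headers, params=params, data=imgByteArr)
    response.raise_for_status()
    operationLocation = response.headers['Operation-Location']

    # GET the operation URL to fetch the result (repeat until the operation has finished)
    response = requests.get(operationLocation, headers=headers)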
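Because the operation is asynchronous, the first GET may come back before recognition has finished. A minimal polling sketch follows, assuming the v2.0 result JSON carries a `status` field ("NotStarted", "Running", "Succeeded" or "Failed") and, on success, a `recognitionResult` with a list of `lines`. The helper names `wait_for_result` and `extract_lines` are hypothetical, and the HTTP call is passed in as a function so the loop itself does not depend on any library.

```python
import time


def wait_for_result(fetch_status, interval=1.0, max_attempts=30):
    """Call fetch_status() until the operation finishes; return the final JSON."""
    for _ in range(max_attempts):
        result = fetch_status()
        status = result.get("status")
        if status == "Succeeded":
            return result
        if status == "Failed":
            raise RuntimeError("Text recognition failed")
        # Still "NotStarted" or "Running": wait before asking again.
        time.sleep(interval)
    raise TimeoutError("Text recognition did not finish in time")


def extract_lines(result):
    """Return the text of each recognized line (assumed response layout)."""
    lines = result.get("recognitionResult", {}).get("lines", [])
    return [line["text"] for line in lines]
```

With the variables from the snippet above, this could be called as `wait_for_result(lambda: requests.get(operationLocation, headers=headers).json())`, and the returned JSON passed to `extract_lines`.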