I am using the following code to do OCR using Google Vision AI:
import base64

from google.cloud import vision

client = vision.ImageAnnotatorClient()
bindata = base64.b64decode(b64data)  # b64data is a Base64-encoded image file
image = vision.Image(content=bindata)
results = client.text_detection(image=image)
This works fine for most images. However, I have now found an image in which the text is not detected, although it is detected when I rotate the image.
So I am looking for extra options that I could pass to the text_detection method, but I cannot find good reference documentation for it anywhere. Even the official Google documentation does not include that method.
Where can I find the right documentation for the text_detection method? (Or, better, how can I improve the OCR so that it correctly detects upside-down text?)
Google's documentation page Detect text in images references text_detection, but it is unclear to me where this method is defined. It is available on vision_v1.ImageAnnotatorClient, yet it is not obviously documented in the SDK reference, APIs Explorer, or the googleapis (gRPC) repo.
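One possible explanation, as far as I can tell from the installed package: the single-feature helpers such as text_detection appear to be attached to the client class dynamically by a decorator in the package's vision_helpers module, which would explain why they do not show up in the generated API reference. The toy sketch below only illustrates that pattern. ToyClient, FeatureType, and the helper bodies are hypothetical stand-ins, not the library's code, and the real helper may differ in detail.

```python
from enum import Enum


class FeatureType(Enum):
    # Hypothetical stand-in for vision.Feature.Type.
    TEXT_DETECTION = 1
    DOCUMENT_TEXT_DETECTION = 2


def _make_helper(feature):
    """Build a method that calls annotate_image with exactly one feature."""

    def helper(self, image, **kwargs):
        # Extra keyword arguments (for example image_context) are
        # forwarded into the request unchanged.
        request = dict(image=image, features=[{"type_": feature}], **kwargs)
        return self.annotate_image(request)

    helper.__name__ = feature.name.lower()
    return helper


def add_single_feature_methods(cls):
    """Class decorator: attach one convenience method per feature type."""
    for feature in FeatureType:
        setattr(cls, feature.name.lower(), _make_helper(feature))
    return cls


@add_single_feature_methods
class ToyClient:
    def annotate_image(self, request):
        # Toy behavior: echo the request instead of calling a service.
        return request


request = ToyClient().text_detection(
    b"image-bytes", image_context={"language_hints": ["en"]}
)
print(request)
```

If the real helper works this way, any option that annotate_image accepts in its request should also be reachable through text_detection as a keyword argument, but I would verify that against the installed version before relying on it.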
Further down the same page, the documentation uses APIs Explorer and annotate_image, which supports providing features. See the gRPC message called Feature, as defined in Google's googleapis repo.
from google.cloud import vision
from google.cloud.vision_v1.types import (
    AnnotateImageRequest,
    Feature,
    Image,
    ImageContext,
    TextDetectionParams,
)

client = vision.ImageAnnotatorClient()

# Read the raw image bytes from disk.
try:
    with open("question.png", "rb") as image_file:
        content = image_file.read()
except FileNotFoundError:
    print("Image file not found.")
    content = b""

image = Image(content=content)

# Request document text detection explicitly instead of using a helper.
features = [
    Feature(
        model="builtin/latest",
        type_=Feature.Type.DOCUMENT_TEXT_DETECTION,
    ),
]

# Extra OCR options go into the image context.
image_context = ImageContext(
    language_hints=["en"],
    text_detection_params=TextDetectionParams(
        enable_text_detection_confidence_score=True
    ),
)

rqst = AnnotateImageRequest(
    image=image,
    features=features,
    image_context=image_context,
)

resp = client.annotate_image(request=rqst)
print(resp)
question.png is a screenshot of your question.
Yields:
text_annotations { ... }
...
full_text_annotation {
pages { ... }
text: "Google Vision Al does not include correct documentation\nAsked today Modified today Viewed 21 times\nPart of Google Cloud Collective\nVote\nI am using the following code to do OCR using Google Vision Al:\nfrom google.cloud import vision\nclient = vision. ImageAnnotatorClient()\nbindata = base64. b64decode (b64data)\nimage = vision. Image (content=bindata)\n#b64data is a Base64 encoded image file\nresults = client.text_detection (image-image)\nWhich works fine for most images. I found now an image where text is not detected, but when I\nrotate the image, it does get detected.\nSo, I am looking for extra options I could possibly give the text_detection method, but I can\nnowhere find good reference documentation on it. Even the official Google documentation does\nnot include that method.\nWhere can I find the right documentation for the text_detection method? (Or, better, how can I\nimprove the OCR detection to correctly detect upside down text)."
}
...
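On the upside-down text: since you report that rotating the image makes detection succeed, one workaround is to retry with rotated copies until some text comes back. This is only a sketch under assumptions. It assumes Pillow is installed for the rotation, that an empty result is a good enough signal that detection failed, and that the input can be re-encoded as PNG. The names detect_with_rotation, rotate_png, and vision_text are mine, not part of the Vision SDK.

```python
import io
from typing import Callable, Optional, Sequence, Tuple


def rotate_png(data: bytes, degrees: int) -> bytes:
    """Rotate an encoded image counterclockwise and re-encode it as PNG."""
    from PIL import Image  # Pillow, assumed to be installed

    with Image.open(io.BytesIO(data)) as img:
        rotated = img.rotate(degrees, expand=True)
    buffer = io.BytesIO()
    rotated.save(buffer, format="PNG")
    return buffer.getvalue()


def vision_text(data: bytes) -> str:
    """Run text_detection and return the full detected text."""
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    response = client.text_detection(image=vision.Image(content=data))
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text


def detect_with_rotation(
    data: bytes,
    detect: Callable[[bytes], str],
    rotate: Callable[[bytes, int], bytes],
    angles: Sequence[int] = (0, 90, 180, 270),
) -> Tuple[Optional[int], str]:
    """Try each angle in turn and return the first (angle, text) with text."""
    for angle in angles:
        candidate = data if angle == 0 else rotate(data, angle)
        text = detect(candidate)
        if text.strip():
            return angle, text
    return None, ""


# Example wiring (makes up to four API calls per image):
# with open("image.png", "rb") as f:
#     angle, text = detect_with_rotation(f.read(), vision_text, rotate_png)
```

Passing detect and rotate in as callables keeps the retry loop independent of the SDK, so it can be exercised with fakes before spending API calls on it.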