google-cloud-vision vision-api

Updates in Google's Vision API response


I have been using Google's Vision API for text detection for the past few months. The API returns a "map" of the words present in a given image or document. Each element in the "map" (JSON) contains the text of a word and its coordinates in the document.

Earlier, the mapping split text into words on both spaces and special characters. Now it seems that text is split into words on spaces only.

For example, a document with the text "Foo.Bar Hello World" used to produce 4 elements, i.e. 4 words, because text was also broken on special characters. Now "Foo.Bar Hello World" produces only 3 words.
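To make the difference concrete, here is a minimal Python sketch of the two splitting behaviors as described above. It does not call the Vision API. The function names are hypothetical, and treating a "special character" as anything that is neither a word character nor whitespace is an assumption, since the API's actual rules are not documented in this post.

```python
import re


def split_on_spaces(text):
    """Newer behavior as described above: break only on whitespace."""
    return text.split()


def split_on_spaces_and_specials(text):
    """Older behavior as described above: special characters also separate words.

    Assumption: a "special character" is any character that is neither a
    word character (letter, digit, underscore) nor whitespace.
    """
    # Split on runs of non-word characters and drop empty pieces.
    return [piece for piece in re.split(r"[^\w]+", text) if piece]


sample = "Foo.Bar Hello World"
print(split_on_spaces(sample))               # 3 words
print(split_on_spaces_and_specials(sample))  # 4 words
```

With this sample, the first function keeps "Foo.Bar" as one word, while the second yields "Foo" and "Bar" separately, which matches the 3 versus 4 count above.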

My question: is there a way to choose which version of the API to use? Or is there a way to track changes to the structure of the API's response, or changes to the TEXT_DETECTION model?

What I have checked out:

  1. https://cloud.google.com/vision/docs/release-notes -> This lists releases for the API as a whole, not changes to the model that runs the OCR or to the "post-processing" of the model's results.
  2. The cloud-vision-discuss Google group, for similar issues.

Solution

  • We've faced exactly the same problem. Look at Google's answer. Unbelievable:

    We just received an update from the Vision API engineering team that they just recently released a new OCR model last week, and they advised us that the release notes will be published soon.

    The engineering team also informed us that they are aware of this issue and they are further investigating it.

    Therefore, I have asked them to update us on this quality regression of the OCR model. We also asked them about the possibility of using the previous model version.

    There is no ETA for when the Vision API team would get back to us with their response, but please expect us to update you by the end of this week at the latest or as soon as they update us.
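Until there is a response, one possible stopgap is to re-split the returned words on the client side. The sketch below is not from Google. It assumes you read the `fullTextAnnotation` part of the JSON response, where each word carries a list of `symbols` with a `text` and a `boundingBox`, rather than the flat word list. The helper names and the definition of "special character" are hypothetical, and the resulting boxes are simple axis-aligned unions, so they may not match what the older model returned.

```python
import re

# Assumption: a "special character" is a single character that is neither
# a word character (letter, digit, underscore) nor whitespace.
SPECIAL = re.compile(r"[^\w\s]")


def union_box(symbols):
    """Axis-aligned box covering all symbols: (x_min, y_min, x_max, y_max)."""
    # A coordinate of 0 can be missing from the JSON, so default to 0.
    xs = [v.get("x", 0) for s in symbols for v in s["boundingBox"]["vertices"]]
    ys = [v.get("y", 0) for s in symbols for v in s["boundingBox"]["vertices"]]
    return (min(xs), min(ys), max(xs), max(ys))


def resplit_word(word):
    """Split one word dict into sub-words wherever a special character occurs."""
    pieces, current = [], []
    for symbol in word["symbols"]:
        if SPECIAL.fullmatch(symbol["text"]):
            # The special character acts as a separator and is dropped.
            if current:
                pieces.append(current)
            current = []
        else:
            current.append(symbol)
    if current:
        pieces.append(current)
    return [
        {"text": "".join(s["text"] for s in piece), "box": union_box(piece)}
        for piece in pieces
    ]
```

Applied to a word whose symbols spell "Foo.Bar", this returns two entries, "Foo" and "Bar", each with a box computed from its own symbols.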