
iOS Vision: VNRecognizedText boundingBox(for:) method returning identical bounding box for any range


I'm using the iOS Vision framework to perform OCR via a VNRecognizeTextRequest call, and I'm trying to locate each individual character in the resulting VNRecognizedText observations. However, when I call the boundingBox(for range: Range<String.Index>) method on any VNRecognizedText object and for any valid range within the recognized text, I get the same bounding box back. This bounding box corresponds to the bounding box of the entire string.
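
For reference, here is a minimal sketch of the per-character lookup described above. It assumes the input is a `CGImage`; the helper name `characterBoxes` and its return shape are hypothetical and chosen only for illustration.

```swift
import Vision
import CoreGraphics

// Hypothetical helper: for each recognized text observation, returns the
// characters of the top candidate paired with the normalized bounding box
// that Vision reports for that single-character range (nil if unavailable).
func characterBoxes(in image: CGImage,
                    level: VNRequestTextRecognitionLevel) throws -> [[(Character, CGRect?)]] {
    let request = VNRecognizeTextRequest(completionHandler: nil)
    request.recognitionLevel = level

    // perform(_:) runs synchronously, so results are populated on return.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    let observations = request.results as? [VNRecognizedTextObservation] ?? []
    return observations.compactMap { observation -> [(Character, CGRect?)]? in
        guard let candidate = observation.topCandidates(1).first else { return nil }
        let text = candidate.string
        return text.indices.map { index -> (Character, CGRect?) in
            // A range covering exactly one character.
            let range = index..<text.index(after: index)
            let box = try? candidate.boundingBox(for: range)
            return (text[index], box?.boundingBox)
        }
    }
}
```

Calling this with `level: .accurate` is the configuration in which every character of an observation comes back with the same rectangle.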

Am I misunderstanding the boundingBox(for:) method, or is there some other way to get discrete location info for single characters within a recognized text observation?

Thanks in advance!

Edit:

After looking into this more, I've realized that the bounding boxes are tied to word groups and whitespace. Consider a recognized text observation with a string value of "Foo bar". Calling boundingBox(for:) for each character in "Foo" returns the exact same bounding box, which, based on its dimensions, seems to correspond to the entire substring "Foo" rather than the single character whose range was passed in. Stranger still, the bounding box for the whitespace character is simply an empty region at the origin whose edges don't line up with the substrings on either side of it. Finally, the second substring behaves the same as the first: each character in "bar" has the same bounding box.


Solution

  • After hours of further investigation, I decided to get in touch with Apple Developer Tech Support. Sure enough, this is a bug! It manifests when VNRecognizeTextRequest.recognitionLevel is set to .accurate, as it was in my code. When recognitionLevel is set to .fast, the results behave as expected, with a discrete bounding box for each character.
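
A minimal sketch of the workaround, assuming the request is later performed with a `VNImageRequestHandler` elsewhere in the app. The printing in the completion handler is only illustrative, and `.fast` generally trades some recognition accuracy for speed, so it may not suit every use case.

```swift
import Vision

// Build a text recognition request whose completion handler walks each
// character of the top candidate and asks Vision for its bounding box.
let request = VNRecognizeTextRequest { request, error in
    guard error == nil,
          let observations = request.results as? [VNRecognizedTextObservation] else { return }

    for observation in observations {
        guard let candidate = observation.topCandidates(1).first else { continue }
        let text = candidate.string
        for index in text.indices {
            let range = index..<text.index(after: index)
            // boundingBox(for:) throws and may return nil; skip those cases.
            if let box = try? candidate.boundingBox(for: range) {
                print(text[index], box.boundingBox) // normalized image coordinates
            }
        }
    }
}

// Workaround from the answer above: use .fast instead of .accurate.
request.recognitionLevel = .fast
```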