Tags: ios, swift, ios-vision

Swift iOS - Vision does not return any observations from cgImage


I'm building an app that lets the user trace Japanese characters and then uses Vision to determine whether they have been traced correctly. I'm testing with both English and Japanese characters, but neither returns any observations, and therefore no recognised strings.

import UIKit
import Vision

// Renders the canvas view's hierarchy into a UIImage.
func convertCanvasToImage(view: UIView) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: view.bounds.size)
    return renderer.image { _ in
        view.drawHierarchy(in: view.bounds, afterScreenUpdates: true)
    }
}

func runVisionRecognition(canvas: Canvas) {

    NSLog("Start runVisionRecognition")
    let uiImage = convertCanvasToImage(view: canvas)
    guard let cgImage = uiImage.cgImage else { return }

    let requestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    let request = VNRecognizeTextRequest(completionHandler: recognizeTextHandler)
    request.recognitionLevel = .accurate
    request.recognitionLanguages = ["en-US"]
    //request.minimumTextHeight = 0.1
    request.usesLanguageCorrection = true
    // request.maximumRecognitionCandidates = 10
    
    do {
        try requestHandler.perform([request])
    } catch {
        NSLog("Uh oh! \(error).")
    }
}

func recognizeTextHandler(request: VNRequest, error: Error?) {
    guard let observations =
            request.results as? [VNRecognizedTextObservation] else {
        NSLog("No text observations; results were \(String(describing: request.results))")
        return
    }
    let recognizedStrings = observations.compactMap { observation in
        return observation.topCandidates(1).first?.string
    }
    
    NSLog("Observation: \(observations)")
    NSLog("Recognised Strings: \(recognizedStrings)")
}

My characters come from a Canvas that I then translate into an image to be fed into Vision (see the top function).

Below are some examples of my handwriting on the Canvas. The logs show 'Observation: []' and 'Recognised Strings: []', but as you can see, the Photos app recognises the same input fine!

View of my app, with the letters drawn onto a Canvas

Screenshot from the Photos app, showing the letters being read clearly (both English and Japanese)

My theory is that something is going wrong in the conversion from Canvas to CGImage. However, if you look at the first image, the top 'APPLE' is the converted image of my bottom drawing of 'APPLE', and it appears to render correctly.
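A quick way to check that theory is to write the rendered UIImage to disk and inspect the actual pixels that Vision receives. A minimal sketch (the helper name and file name are my own, not from the app):

```swift
import UIKit

// Hypothetical debugging helper: saves the rendered canvas image to the
// app's Documents directory so it can be inspected via the Files app
// or the Xcode device window.
func dumpRenderedImage(_ image: UIImage, name: String = "canvas-debug.png") {
    guard let data = image.pngData() else {
        NSLog("Could not encode rendered image as PNG")
        return
    }
    let url = FileManager.default
        .urls(for: .documentDirectory, in: .userDomainMask)[0]
        .appendingPathComponent(name)
    do {
        try data.write(to: url)
        NSLog("Wrote rendered canvas to \(url.path)")
    } catch {
        NSLog("Failed to write debug image: \(error)")
    }
}
```

If the saved PNG has a transparent background rather than a solid one, the problem is in the rendering step rather than in Vision itself.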


Solution

  • I managed to solve this myself: irritatingly, the problem was with the CanvasWrapper all along.

    struct CanvasWrapper: UIViewRepresentable {
        @Binding var canvas: Canvas
        
        func makeUIView(context: Context) -> UIView {
            return canvas
        }
        
        func updateUIView(_ uiView: UIView, context: Context) {
            uiView.backgroundColor = .white
        }
        
        func makeCoordinator() -> Coordinator {
            Coordinator(self)
        }
        
        class Coordinator: NSObject {
            var parent: CanvasWrapper
    
            init(_ parent: CanvasWrapper) {
                self.parent = parent
            }
        }
    }
    

    uiView.backgroundColor was originally set to .clear, which I believe breaks the image conversion: the rendered UIImage ends up with transparent pixels behind the strokes, leaving Vision nothing to recognise. Setting it to .white gives the text an opaque background, and recognition works properly.
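    An alternative that avoids changing the view's backgroundColor is to fill an opaque background inside the renderer closure before drawing the view hierarchy. A sketch of that variant of convertCanvasToImage:

    ```swift
    import UIKit

    // Variant of convertCanvasToImage that paints a white background
    // during rendering, so the view itself can stay .clear on screen.
    func convertCanvasToImage(view: UIView) -> UIImage {
        let renderer = UIGraphicsImageRenderer(size: view.bounds.size)
        return renderer.image { ctx in
            // Fill first so any transparent pixels become opaque white.
            UIColor.white.setFill()
            ctx.fill(view.bounds)
            view.drawHierarchy(in: view.bounds, afterScreenUpdates: true)
        }
    }
    ```

    Separately, since the goal is Japanese recognition: with recognitionLanguages set to only ["en-US"], Japanese input may still go unrecognised even with a correct image, so adding "ja" (supported by newer revisions of VNRecognizeTextRequest on recent iOS versions) is likely also needed.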