ios swift speech-to-text speech-recognition-api

How To Make iOS Speech-To-Text Persistent


I am doing initial research on a potential new product. Part of it requires Speech-To-Text on both iPhones and iPads to stay on until the user turns it off. When I tried it myself, I noticed that it shuts off automatically either after about 30 seconds, whether or not the user has stopped speaking, or after the speaker has produced a certain number of questionable words. Either way, this product needs it to stay on until it is explicitly told to stop. Has anybody worked with this before? And yes, I have searched thoroughly; I couldn't find anything of substance, and especially nothing written in the right language. Thanks, friends!


Solution

  • import Speech
    
    // audioFileURL is assumed to be a file URL pointing at the audio to transcribe.
    let recognizer = SFSpeechRecognizer()
    let request = SFSpeechURLRecognitionRequest(url: audioFileURL)
    #if targetEnvironment(simulator)
      // On-device recognition only appears to work on a physical device, not the simulator.
      request.requiresOnDeviceRecognition = false
    #else
      request.requiresOnDeviceRecognition = true
    #endif
    recognizer?.recognitionTask(with: request, resultHandler: { (result, error) in
      if let result = result {
        // Unwrap the result so the transcript is not printed as Optional(...).
        print(result.bestTranscription.formattedString)
      } else if let error = error {
        print("Recognition failed: \(error)")
      }
    })
    

    The above code snippet, when run on a physical device, continuously ("persistently") transcribes audio using Apple's Speech framework.

    The magic line here is request.requiresOnDeviceRecognition = ...

    If request.requiresOnDeviceRecognition is true and SFSpeechRecognizer#supportsOnDeviceRecognition is true, the audio is transcribed continuously until the battery dies, the user cancels transcription, or some other error or terminating condition occurs. This has at least been true in my trials.
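Since both flags have to be true, it may help to check support before requiring on-device recognition. The following is only a sketch: the helper name enableOnDeviceRecognitionIfSupported is made up for illustration, and it assumes iOS 13 or later, where both properties exist.

```swift
import Speech

/// Hypothetical helper (not part of the Speech framework).
/// Requires on-device recognition only when the recognizer reports support for it.
/// Returns true when on-device recognition was enabled on the request.
@available(iOS 13.0, macOS 10.15, *)
@discardableResult
func enableOnDeviceRecognitionIfSupported(
    recognizer: SFSpeechRecognizer,
    request: SFSpeechRecognitionRequest
) -> Bool {
    // Support depends on the device and on the recognizer's locale.
    let supported = recognizer.supportsOnDeviceRecognition
    request.requiresOnDeviceRecognition = supported
    return supported
}
```

If the helper returns false, the request falls back to the default recognition path, so the persistent behaviour described above should not be expected.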

    Docs:

    https://developer.apple.com/documentation/speech/recognizing_speech_in_live_audio
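The linked article is about live microphone audio rather than a file URL. A rough sketch of that setup with the same requiresOnDeviceRecognition flag might look like the following. The class name LiveTranscriber is made up for illustration, the code is iOS-only because it uses AVAudioSession, and it assumes speech recognition and microphone permission have already been granted. I have not verified how long this variant keeps running.

```swift
import AVFoundation
import Speech

/// Hypothetical wrapper for illustration; not an Apple API.
@available(iOS 13.0, *)
final class LiveTranscriber {
    private let audioEngine = AVAudioEngine()
    private let recognizer = SFSpeechRecognizer()
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?

    func start() throws {
        // Configure the shared audio session for recording.
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.record, mode: .measurement, options: .duckOthers)
        try session.setActive(true, options: .notifyOthersOnDeactivation)

        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true
        // Require on-device recognition only if this recognizer supports it.
        request.requiresOnDeviceRecognition = recognizer?.supportsOnDeviceRecognition ?? false
        self.request = request

        // Feed microphone buffers into the recognition request.
        let input = audioEngine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }

        audioEngine.prepare()
        try audioEngine.start()

        task = recognizer?.recognitionTask(with: request) { [weak self] result, error in
            if let result = result {
                print(result.bestTranscription.formattedString)
            }
            // Tear down once the task reports an error or a final result.
            if error != nil || result?.isFinal == true {
                self?.stop()
            }
        }
    }

    func stop() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        request?.endAudio()
        request = nil
        task = nil
    }
}
```

Call start() when the user turns transcription on and stop() when they turn it off. Keep a strong reference to the LiveTranscriber instance for as long as transcription should run.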