[SOLVED] speechSynthesis not working on mobile Safari even though it's supported

I'm trying to use the speechSynthesis API. It works on desktop browsers and mobile Chrome, but not on mobile Safari.

  const msg = new SpeechSynthesisUtterance("Hello World");
  window.speechSynthesis.speak(msg);

I added a small test, and it shows that the API is supported on Safari. Could a permissions issue be preventing it from working?

  if ("speechSynthesis" in window) {
    alert("yay");
  } else {
    alert("no");
  }

Solution

On my end, the issue came down to loading speech synthesis properly on mobile Safari.

There are some things to check, in order:

1. Are voices loaded?
2. Are voices even installed on your system?
3. Is the utterance configured correctly?
4. Is the speak function called from within a user interaction event?
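
If it is unclear which of these checks fails on a given device, a small helper along the following lines can narrow it down. This is only a sketch covering the first two checks: `checkSpeechSetup` and the shape of its result are hypothetical names, and the `synth` parameter (expected to look like `window.speechSynthesis`) is an assumption made so the helper can also be tried with a stand-in object. Note that an empty voice list can mean either "not loaded yet" or "none installed", so it is worth calling the helper again after a short delay.

```javascript
/**
 * Reports the state of the first two checks for a given locale.
 * Sketch only: the function name and result fields are hypothetical.
 * @param synth an object like window.speechSynthesis (may be undefined)
 * @param locale the language code to look for, e.g. 'en-US'
 */
function checkSpeechSetup (synth, locale) {
  const supported = Boolean(synth) && typeof synth.getVoices === 'function'
  const voices = supported ? synth.getVoices() : []
  const matching = voices.filter(voice => voice.lang === locale)

  return {
    supported, // is the API present at all?
    voicesLoaded: voices.length > 0, // has the browser reported any voices yet?
    voiceForLocale: matching.length > 0, // is there a voice for the requested locale?
    availableLocales: [...new Set(voices.map(voice => voice.lang))]
  }
}

// In a browser console:
// console.log(checkSpeechSetup(window.speechSynthesis, 'en-US'))
```

The last two checks (utterance configuration and user interaction) still have to be verified by reading the calling code.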

The following example summarizes these checks and works on macOS desktop browsers as well as iOS Safari:

let _speechSynth
let _voices
const _cache = {}

/**
 * Retries every 100 ms until voices have been loaded. This example includes no stop condition,
 * so it assumes that voices are installed on the host system.
 */
function loadVoicesWhenAvailable (onComplete = () => {}) {
  _speechSynth = window.speechSynthesis
  const voices = _speechSynth.getVoices()

  if (voices.length !== 0) {
    _voices = voices
    onComplete()
  } else {
    return setTimeout(function () { loadVoicesWhenAvailable(onComplete) }, 100)
  }
}

/**
 * Returns all loaded voices for a given language code.
 * Results are cached per locale.
 */
function getVoices (locale) {
  if (!_speechSynth) {
    throw new Error('Speech synthesis is not supported or has not been initialized')
  }
  if (!_voices) {
    throw new Error('Voices have not been loaded yet')
  }
  if (_cache[locale]) return _cache[locale]

  _cache[locale] = _voices.filter(voice => voice.lang === locale)
  return _cache[locale]
}

/**
 * Speaks a given text.
 * @param locale the locale of the voice to use
 * @param text the text to speak
 * @param onEnd optional callback, invoked when speaking has finished
 */
function playByText (locale, text, onEnd) {
  const voices = getVoices(locale)

  // TODO load preference here, e.g. male / female etc.
  // TODO but for now we just use the first occurrence
  const utterance = new window.SpeechSynthesisUtterance()
  if (voices.length > 0) {
    utterance.voice = voices[0] // otherwise the browser falls back to a default voice for utterance.lang
  }
  utterance.volume = 1
  utterance.rate = 1
  utterance.pitch = 0.8
  utterance.text = text
  utterance.lang = locale

  if (onEnd) {
    utterance.onend = onEnd
  }

  _speechSynth.cancel() // cancel current speak, if any is running
  _speechSynth.speak(utterance)
}

// on document ready
loadVoicesWhenAvailable(function () {
  console.log("loaded")
})

// called from the button's click handler, so that speaking starts from a user interaction
function speak () {
  setTimeout(() => playByText("en-US", "Hello, world"), 300)
}

<button onclick="speak()">speak</button>

Details on the code are added as comments within the snippet.
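
As noted in the comments, the retry loop has no stopper, so it polls forever on a system without installed voices. One way to bound it is sketched below, keeping the callback style of the snippet. This is an illustration under stated assumptions, not part of the tested solution: `loadVoicesWithLimit`, its `maxAttempts`, `intervalMs` and `schedule` options, and the error-first callback are all hypothetical names. The `schedule` option only exists so that the timer can be replaced when trying the function outside a browser.

```javascript
/**
 * Polls synth.getVoices() until voices are available or maxAttempts is reached.
 * Sketch only: the function name, its options and the error-first callback
 * are assumptions of this example.
 * @param synth an object like window.speechSynthesis
 * @param options optional { maxAttempts, intervalMs, schedule }
 * @param onDone called as onDone(error, voices) exactly once
 */
function loadVoicesWithLimit (synth, options, onDone) {
  const { maxAttempts = 50, intervalMs = 100, schedule = setTimeout } = options || {}
  let attempts = 0

  function attempt () {
    attempts += 1
    const voices = synth.getVoices()

    if (voices.length > 0) {
      onDone(null, voices)
      return
    }
    if (attempts >= maxAttempts) {
      // stopper: give up instead of retrying forever
      onDone(new Error('No voices available after ' + attempts + ' attempts'), [])
      return
    }
    schedule(attempt, intervalMs)
  }

  attempt()
}

// Usage in a browser (guarded so the sketch also loads elsewhere):
if (typeof window !== 'undefined' && window.speechSynthesis) {
  loadVoicesWithLimit(window.speechSynthesis, { maxAttempts: 30 }, function (err, voices) {
    if (err) {
      console.warn(err.message)
    } else {
      console.log('loaded ' + voices.length + ' voices')
    }
  })
}
```

With the defaults this gives up after roughly five seconds; on failure the page can then hide the speak button or show a hint that no voices are installed.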