
Difference between TextToSpeech.addEarcon() and TextToSpeech.addSpeech()

I just started working with Android's TextToSpeech API and I came across two methods that seem to be exactly the same, namely:

TextToSpeech.addEarcon() : Adds a mapping between a string of text and a sound resource in a package. Use this to add custom earcons.


TextToSpeech.addSpeech() : Adds a mapping between a string of text and a sound resource in a package. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text it is missing.


  • Earcons are not meant to convey detailed information but rather to provide auditory signals or cues.

    That is why both methods exist in the Android API: They don't serve the same purpose.

    Note: A sound added by either addSpeech() or addEarcon() cannot be played back as part of a phrase or sentence. That is, it is only played when its mapped text is passed on its own as the parameter. For example, for:

    tts.addSpeech("haha", getPackageName(), R.raw.haha); // second argument: the package that contains the resource

    the following works:

    tts.speak("haha", TextToSpeech.QUEUE_FLUSH, null);

    but the following won't work (by design):

    tts.speak("haha what a funny joke", TextToSpeech.QUEUE_FLUSH, null);
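
    To make the difference in purpose concrete, here is a rough sketch of how the two mappings might be registered and triggered. It assumes API level 21+ (for the four-argument speak() and playEarcon()), and the activity name, the R.raw.beep resource, the "[beep]" earcon name and the utterance IDs are hypothetical, chosen only for illustration. It is not taken from the documentation and needs an Android project to build:

    ```java
    import android.app.Activity;
    import android.os.Bundle;
    import android.speech.tts.TextToSpeech;

    // Hypothetical activity; R.raw.beep and "[beep]" are made-up names.
    public class SoundDemoActivity extends Activity implements TextToSpeech.OnInitListener {
        private TextToSpeech tts;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            tts = new TextToSpeech(this, this); // onInit() is called once the engine is ready
        }

        @Override
        public void onInit(int status) {
            if (status != TextToSpeech.SUCCESS) {
                return; // engine unavailable, nothing to register
            }
            // Speech mapping: stands in for synthesis of the exact text "haha".
            tts.addSpeech("haha", getPackageName(), R.raw.haha);
            // Earcon mapping: a short cue that is triggered by name, not spoken.
            tts.addEarcon("[beep]", getPackageName(), R.raw.beep);

            tts.speak("haha", TextToSpeech.QUEUE_FLUSH, null, "laughId");
            tts.playEarcon("[beep]", TextToSpeech.QUEUE_ADD, null, "beepId");
        }

        @Override
        protected void onDestroy() {
            if (tts != null) {
                tts.shutdown(); // release the engine
            }
            super.onDestroy();
        }
    }
    ```

    Note that the earcon is played through playEarcon() rather than speak(), which reflects its role as a cue rather than as replacement speech.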

    Lastly, defer these mappings (i.e. the calls to addSpeech() or addEarcon()) until TTS has been successfully initialized (i.e. in onInit()), as highlighted in this SO answer:
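
Separately from that linked answer, the single-string restriction noted earlier can be pictured with a toy model in plain Java. This is only a simplified illustration, assuming the mapping behaves like a lookup keyed on the whole text passed to speak(); the class SpeechMappingModel and the resource id 42 are hypothetical, and this is not the real TextToSpeech implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the text-to-sound table that addSpeech() fills (hypothetical).
class SpeechMappingModel {
    private final Map<String, Integer> mappings = new HashMap<>();

    public void addSpeech(String text, int resourceId) {
        mappings.put(text, resourceId);
    }

    // Describes what would happen for the given text.
    public String speak(String text) {
        // The whole string is the key; it is not split into words.
        Integer resourceId = mappings.get(text);
        if (resourceId != null) {
            return "play resource " + resourceId;
        }
        return "synthesize \"" + text + "\"";
    }

    public static void main(String[] args) {
        SpeechMappingModel tts = new SpeechMappingModel();
        tts.addSpeech("haha", 42); // 42 stands in for R.raw.haha
        System.out.println(tts.speak("haha"));
        System.out.println(tts.speak("haha what a funny joke"));
    }
}
```

Under this model, "haha" on its own hits the mapping, while the longer sentence misses it and falls back to synthesis, which matches the behaviour described in the note.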