The official documentation for pyttsx3 gives a variation of the following example for printing words that are currently being said. The only difference is that the print statements are in Python 3.x syntax instead of Python 2.x.
import pyttsx3

def onStart(name):
    print('starting', name)

def onWord(name, location, length):
    print('word', name, location, length)

def onEnd(name, completed):
    print('finishing', name, completed)

engine = pyttsx3.init()
engine.connect('started-utterance', onStart)
engine.connect('started-word', onWord)
engine.connect('finished-utterance', onEnd)
engine.say('The quick brown fox jumped over the lazy dog.')
engine.runAndWait()
The following incorrect output is printed.
starting None
word None 1 0
finishing None True
How can I print the actual word being uttered?
EDIT: If this task is not possible in pyttsx3, I am also open to using any other text-to-speech library to accomplish this.
The attribute name is intended to be a tag that is added to an utterance. You have to set it yourself as the second, optional argument to say, for example say("hello world", "introduction"). In this case the value of name in all of the callbacks will be "introduction". From the documentation:
say(text : unicode, name : string) → None
Queues a command to speak an utterance. The speech is output according to the properties set before this command in the queue.
Parameters:
text – Text to speak.
name – Name to associate with the utterance. Included in notifications about this utterance.
You can use this by passing the actual text as the name in the engine.say() call, i.e., engine.say(sentence, sentence). The callback then receives the full sentence in name, and you can use the location argument (the start index of the word in the string) together with the length argument (the number of characters in the word) to slice the word out of the sentence and print it.
MCVE:
import pyttsx3

def onStart(name):
    print('starting', name)

def onWord(name, location, length):
    # name holds the full sentence, so slicing it yields the current word
    print('word', name[location:location+length], location, length)

def onEnd(name, completed):
    print('finishing', name, completed)

engine = pyttsx3.init()
engine.connect('started-utterance', onStart)
engine.connect('started-word', onWord)
engine.connect('finished-utterance', onEnd)
sentence = 'The quick brown fox jumped over the lazy dog.'
# Pass the sentence as both the text to speak and the utterance name
engine.say(sentence, sentence)
engine.runAndWait()
Output:
starting The quick brown fox jumped over the lazy dog.
word The 0 3
word quick 4 5
word brown 10 5
word fox 16 3
word jumped 20 6
word over 27 4
word the 32 3
word lazy 36 4
word dog 41 3
finishing The quick brown fox jumped over the lazy dog. True
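The slicing itself can be checked without a speech engine. The sketch below replays the (location, length) pairs from the output above against the same sentence; word_at is a hypothetical helper name used only for illustration, not part of pyttsx3.

```python
def word_at(sentence, location, length):
    """Return the substring that a started-word callback points at."""
    return sentence[location:location + length]

sentence = 'The quick brown fox jumped over the lazy dog.'

# (location, length) pairs copied from the espeak output above
offsets = [(0, 3), (4, 5), (10, 5), (16, 3), (20, 6),
           (27, 4), (32, 3), (36, 4), (41, 3)]

words = [word_at(sentence, loc, n) for loc, n in offsets]
print(words)
```

The trailing period at index 44 is never reported, which is why the last word comes out as dog rather than dog. in the output.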
Note that each engine implements the callbacks separately. The above was tested with the espeak engine on Linux. Other engines on Windows and Mac may expose different information in these callbacks.
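Since the reported offsets may vary between engines, a more defensive variant might keep name as a short tag, look the text up in a dictionary, and ignore callbacks whose name or offsets do not fit. The following is only a sketch under those assumptions: UtteranceRegistry and its methods are hypothetical names, not part of pyttsx3. The only pyttsx3 calls it relies on are engine.say(text, name) and engine.connect('started-word', callback), both shown above.

```python
class UtteranceRegistry:
    """Maps short utterance names to their text (hypothetical helper, not part of pyttsx3)."""

    def __init__(self):
        self._texts = {}
        self.spoken = []  # words extracted so far, in callback order

    def say(self, engine, text, name):
        # Remember the text under its tag, then queue it with that tag as the name.
        self._texts[name] = text
        engine.say(text, name)

    def on_word(self, name, location, length):
        # Parameter names match the keywords used by the started-word callback above.
        text = self._texts.get(name)
        if text is None or location is None or length is None:
            return None  # unknown or missing name, or missing offsets
        if location < 0 or length <= 0 or location + length > len(text):
            return None  # offsets fall outside the registered text
        word = text[location:location + length]
        self.spoken.append(word)
        print('word', word)
        return word

# Usage with a real engine (assumes pyttsx3 is installed):
#   import pyttsx3
#   engine = pyttsx3.init()
#   registry = UtteranceRegistry()
#   engine.connect('started-word', registry.on_word)
#   registry.say(engine, 'hello world', 'introduction')
#   engine.runAndWait()
```

With this approach a callback such as the one in the question, which received a name of None and a length of 0, is skipped instead of printing a misleading value.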