cmusphinx pocketsphinx

What do the parentheses mean in a cmusphinx result?


My output is:

['<s>', 'does', 'any', '<sil>', 'unable', 'to(3)', 'bear', 'the', 'senate', 'is', 'touching', 'emotion', 'turned', 'away', '<sil>', 'and(2)', 'ill', 'afford', '<sil>', 'without', 'seeking', 'any', 'further', 'explanation', '<sil>', 'and(2)', 'attracted', 'towards(2)', 'him', 'and', 'irresistible', 'magnetism', 'which', 'draws', 'us', 'towards(2)', 'those', 'who', 'have', 'loved', 'to(3)', 'people', 'for(2)', 'whom', 'we', 'mourn', '<sil>', 'extended', 'his', 'hand', 'towards(2)', 'the(2)', 'young', 'man', '</s>']

I get what <s> and <sil> do. But what about to(3)?


Solution

  • It's hard to say with absolute certainty without checking the dictionary file (normally the one with the .dict extension), which maps each word to its pronunciation. There you can compare to(3) with to(2) and to, assuming those entries exist at all.

    However, since many words with the same spelling have more than one pronunciation, the convention is to list each pronunciation as its own dictionary entry, as stated in the official tutorial:

    A dictionary can also contain alternative pronunciations. In that case you can designate them with a number in parentheses:

    the TH IH

    the(2) TH AH

    In the example above, the recognizer would output the or the(2) depending on which pronunciation best matched what the speaker actually said.

    If you're using a pre-made official model, that is what you are seeing here. If you care more about what was said than about how it was pronounced, you can ignore the number in parentheses.
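As a rough illustration of "ignoring the parentheses", here is a minimal Python sketch that post-processes a token list like the one in the question. It does not use the pocketsphinx API at all; the helper names (base_word, clean_hypothesis, group_pronunciations) are made up for this example, and it assumes the marker is always a trailing number in parentheses and that dictionary lines are whitespace-separated, as in the tutorial excerpt above.

```python
import re

# Matches a trailing "(N)" alternate-pronunciation marker, e.g. "to(3)".
VARIANT_SUFFIX = re.compile(r"\(\d+\)$")

# Non-word tokens seen in the question's output (sentence boundaries, silence).
FILLERS = {"<s>", "</s>", "<sil>"}


def base_word(token):
    """Return the token without its "(N)" suffix: "to(3)" -> "to"."""
    return VARIANT_SUFFIX.sub("", token)


def clean_hypothesis(tokens):
    """Drop filler tokens and strip variant markers from the rest."""
    return [base_word(t) for t in tokens if t not in FILLERS]


def group_pronunciations(dict_lines):
    """Map each base word to its list of (entry, phones) pairs."""
    variants = {}
    for line in dict_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        entry, phones = parts[0], parts[1:]
        variants.setdefault(base_word(entry), []).append((entry, phones))
    return variants


tokens = ['<s>', 'extended', 'his', 'hand', 'towards(2)', 'the(2)', 'young', 'man', '</s>']
print(" ".join(clean_hypothesis(tokens)))
# extended his hand towards the young man

# The two sample entries quoted from the tutorial, grouped under one word.
print(group_pronunciations(["the TH IH", "the(2) TH AH"])["the"])
# [('the', ['TH', 'IH']), ('the(2)', ['TH', 'AH'])]
```

To inspect a real model, you could pass the lines of its .dict file to group_pronunciations and look up "to" to see which phone sequence to(3) stands for.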