nlp text-to-speech cmusphinx festival htk

Toolkits to design a TTS (Text-to-speech) system for a custom language?


I'd like to create a TTS system for a Native American language (Wayuunaiki). The language is written in the Latin alphabet, and I also have its phonetic rules (how to convert each word into IPA symbols).

I'm planning to build a database of voice recordings from native speakers. I then want to train a model on that data, using the word-to-IPA rules to get a more accurate speech model.

I'm totally new to natural language processing, so my question is: which tools can I use to do what I'm planning?

I've heard that HTK and CMU Sphinx are quite good at speech recognition, but I have no idea about speech generation. I've also heard about Festival, but I read that it only supports a predefined set of widely spoken languages such as English and Spanish.

Excuse my typing faults. I'm still learning English. Thanks in advance!


Solution

  • You can add a new language to Festival; it is in fact specifically designed to make creating new languages easier. For more details, read the Festvox book (Building Synthetic Voices):

    http://festvox.org/bsv/

    Another toolkit to consider is MaryTTS (formerly OpenMary); see its documentation on new language support:

    https://github.com/marytts/marytts/wiki/New-Language-Support

    It is more modern and might be easier for you.

    Either way, you will have to spend some time writing code that describes your language, usually about 300 lines. After that, you can record a single-speaker TTS database and run the voice-building process. The more you record, the better the result will be.
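
Much of the language description mentioned above is a set of letter-to-sound rules, which is where the word-to-IPA rules from the question would go. Festival expresses such rules in Scheme; the Python sketch below only illustrates the idea and is not the format of either toolkit. The function name `to_phonemes` and the rule table are hypothetical, and the mappings are placeholders, not verified Wayuunaiki phonology.

```python
# Hypothetical grapheme-to-IPA table. These entries are placeholders for
# illustration only; replace them with the real rules for your language.
G2P_RULES = {
    "ch": "tʃ",
    "sh": "ʃ",
    "a": "a",
    "i": "i",
    "k": "k",
    "n": "n",
}


def to_phonemes(word, rules=G2P_RULES):
    """Convert a word to a list of IPA symbols by greedy longest match."""
    word = word.lower()
    max_len = max(len(grapheme) for grapheme in rules)
    phonemes = []
    i = 0
    while i < len(word):
        # Try the longest grapheme first so that "ch" wins over "c" + "h".
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                phonemes.append(rules[chunk])
                i += size
                break
        else:
            # No rule matched at this position: fail loudly instead of guessing.
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phonemes
```

Calling `to_phonemes("china")` with the placeholder table yields one symbol per matched grapheme. A real rule set would also need to handle context-dependent pronunciations, stress, and numbers or abbreviations, which is part of what the roughly 300 lines cover.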