python-3.x speech-recognition pocketsphinx

Where to place a trained speech model


I have trained a pocketsphinx model using sphinxtrain. The problem I'm facing is that I don't know how to use the trained model in my code.

My first thought was to just replace the current model in the pocketsphinx library or to include the trained model somehow.

I have searched a lot, but most of what I found was about training with TensorFlow or about Google's recognition software, and nothing explained how to use an already trained model.

Here is a basic example of how the code works:

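import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    # Calibrate the energy threshold against background noise, then record one phrase
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)

# Recognize with the default pocketsphinx model
output = r.recognize_sphinx(audio)
print(output)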

Solution

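  • I solved the problem by using pocketsphinx's LiveSpeech(), which takes the model paths as arguments. Here hmm is the acoustic model directory, lm the language model file, and dic the pronunciation dictionary; to use a trained model, point these at its files instead of the defaults.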

    import os
    from pocketsphinx import LiveSpeech, get_model_path

    # Directory of the default models bundled with pocketsphinx
    model_path = get_model_path()

    speech = LiveSpeech(
        verbose=False,
        sampling_rate=16000,
        buffer_size=2048,
        no_search=False,
        full_utt=False,
        hmm=os.path.join(model_path, 'en-us'),              # acoustic model directory
        lm=os.path.join(model_path, 'en-us.lm.bin'),        # language model file
        dic=os.path.join(model_path, 'cmudict-en-us.dict')  # pronunciation dictionary
    )

    # Each iteration yields one recognized phrase from the microphone
    for phrase in speech:
        output = phrase.hypothesis()

        if output == 'hello':
            print("recognized")
            print(output)
        else:
            print("not recognized")
            print(output)
    

    When the recognized phrase is 'hello', the if branch prints:

    recognized
    hello
    

    For any other phrase, for example 'hi', the else branch prints:

    not recognized
    hi
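
To load a model trained with sphinxtrain rather than the bundled default, the same three arguments can point at the trained files. The following is a minimal sketch, not a definitive implementation: the helper names `custom_model_kwargs` and `listen_with_custom_model` are hypothetical, the paths are whatever your training run produced, and it assumes the same pocketsphinx Python API as the solution above, where `LiveSpeech` accepts `hmm`, `lm`, and `dic`.

```python
import os


def custom_model_kwargs(hmm_dir, lm_file, dict_file):
    """Check that the trained-model paths exist and return them as
    keyword arguments for LiveSpeech (hypothetical helper)."""
    if not os.path.isdir(hmm_dir):
        raise FileNotFoundError("acoustic model directory not found: " + hmm_dir)
    for path in (lm_file, dict_file):
        if not os.path.isfile(path):
            raise FileNotFoundError("model file not found: " + path)
    return {"hmm": hmm_dir, "lm": lm_file, "dic": dict_file}


def listen_with_custom_model(hmm_dir, lm_file, dict_file):
    """Yield recognized phrases using the trained model (hypothetical helper).

    pocketsphinx is imported here so the path check above can be used
    and tested without the library or a microphone.
    """
    from pocketsphinx import LiveSpeech

    speech = LiveSpeech(**custom_model_kwargs(hmm_dir, lm_file, dict_file))
    for phrase in speech:
        yield phrase.hypothesis()
```

Iterating over `listen_with_custom_model(...)` with the directory and files from your own training output would then behave like the loop above, but with the trained model. For the speech_recognition code in the question, that library's documentation describes the `language` argument of `recognize_sphinx` as also accepting a tuple of (acoustic parameters directory, language model file, phoneme dictionary file), which may be an alternative worth trying.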