Tags: cmusphinx, sphinx4

sphinx-4 NullPointerException at startRecognition


I'm trying to follow this tutorial, but the program crashes on startup. Before the crash it reports many problems with the dictionary and models, such as:

The dictionary is missing a phonetic transcription for the word 'humphrey'

and

Dec 18, 2014 1:14:50 PM edu.cmu.sphinx.linguist.lextree.HMMTree addPronunciation
SEVERE: Missing HMM for unit T with lc=N rc=EH1
13:14:50.601 SEVERE lexTreeLinguist Bad HMM Unit: EH1

I loaded this dictionary (cmudict-0.6d) and got the language and acoustic models from their SourceForge page.

It then crashes with this:

Exception in thread "main" java.lang.NullPointerException
    at edu.cmu.sphinx.linguist.lextree.HMMNode.getBaseUnit(HMMTree.java:506)
    at edu.cmu.sphinx.linguist.lextree.HMMNode.<init>(HMMTree.java:484)
    at edu.cmu.sphinx.linguist.lextree.Node.addSuccessor(HMMTree.java:165)
    at edu.cmu.sphinx.linguist.lextree.HMMTree$EntryPoint.createEntryPointMap(HMMTree.java:1163)
    at edu.cmu.sphinx.linguist.lextree.HMMTree$EntryPointTable.createEntryPointMaps(HMMTree.java:1021)
    at edu.cmu.sphinx.linguist.lextree.HMMTree.compile(HMMTree.java:795)
    at edu.cmu.sphinx.linguist.lextree.HMMTree.<init>(HMMTree.java:716)
    at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.generateHmmTree(LexTreeLinguist.java:433)
    at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.compileGrammar(LexTreeLinguist.java:420)
    at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.allocate(LexTreeLinguist.java:337)
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.allocate(WordPruningBreadthFirstSearchManager.java:232)
    at edu.cmu.sphinx.decoder.AbstractDecoder.allocate(AbstractDecoder.java:92)
    at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:167)
    at edu.cmu.sphinx.api.LiveSpeechRecognizer.startRecognition(LiveSpeechRecognizer.java:46)
    at com.test.sphinxtest.App.main(App.java:25)

Here's my code.

package com.test.sphinxtest;

import java.io.IOException;

import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;
import edu.cmu.sphinx.api.SpeechResult;

/**
 * Hello world!
 *
 */
public class App 
{
    public static void main( String[] args )
    {
        Configuration configuration = new Configuration();

        configuration.setAcousticModelPath("models/acousticmodel/en-us");
        configuration.setDictionaryPath("dictionary/cmudict-0.6d");
        configuration.setLanguageModelPath("models/languagemodel/en-us.lm");

        try {
            LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
            recognizer.startRecognition(true);
            SpeechResult result;
            // Print each hypothesis until the recognizer returns null,
            // and only then stop recognition.
            while ((result = recognizer.getResult()) != null) {
                System.out.println(result.getHypothesis());
            }
            recognizer.stopRecognition();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}

Solution

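  • The crash comes from the dictionary, not from the code. cmudict-0.6d marks vowel stress with digits (for example EH1), but the en-us acoustic model defines its phones without stress marks, so no HMM exists for those units. That is what the "Missing HMM" and "Bad HMM Unit: EH1" messages report, and the missing HMM later surfaces as the NullPointerException. Use a dictionary without stress marks and pass its path to setDictionaryPath. You can download one from here: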

    https://raw.githubusercontent.com/cmusphinx/pocketsphinx/master/model/en-us/cmudict-en-us.dict
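
To check whether a dictionary has this problem, something like the following sketch can help. It is not part of Sphinx-4: the class name `DictionaryNormalizer` and both methods are hypothetical helpers written for illustration. It assumes each entry is a headword followed by whitespace-separated phones, that stress is a trailing digit 0-2 on a phone, and that the target dictionary uses lowercase headwords as cmudict-en-us.dict does. Comment lines are not handled and should be skipped by the caller.

```java
import java.util.Locale;

// Hypothetical helper, not a Sphinx-4 class.
final class DictionaryNormalizer {

    /** Returns true if any phone in the entry ends in a stress digit, e.g. EH1. */
    static boolean hasStressMarks(String entry) {
        String[] tokens = entry.trim().split("\\s+");
        // tokens[0] is the headword (possibly with a variant marker like "(2)"), so skip it.
        for (int i = 1; i < tokens.length; i++) {
            if (tokens[i].matches("[A-Z]+[0-2]")) {
                return true;
            }
        }
        return false;
    }

    /** Lowercases the headword and strips the trailing stress digit from each phone. */
    static String normalize(String entry) {
        String[] tokens = entry.trim().split("\\s+");
        StringBuilder out = new StringBuilder(tokens[0].toLowerCase(Locale.ROOT));
        for (int i = 1; i < tokens.length; i++) {
            out.append(' ').append(tokens[i].replaceAll("[0-2]$", ""));
        }
        return out.toString();
    }
}
```

For an illustrative entry such as `HUMPHREY  HH AH1 M F R IY0`, `hasStressMarks` returns true and `normalize` returns `humphrey HH AH M F R IY`. Converting a dictionary this way is only an approximation, so downloading the dictionary that matches the acoustic model remains the safer route.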