I'm trying to implement a German command-and-control application with CMUSphinx and Java. For now, the application only needs to recognize a few words (the numbers 1 to 9 and yes/no).
Unfortunately, the accuracy is very poor. When a word is recognized correctly, it seems to be only by chance.
Here is my Java code so far (adapted from the tutorial):
public static void main(String[] args) throws IOException {
    // Configuration object
    Configuration configuration = new Configuration();
    // Set path to the acoustic model.
    configuration.setAcousticModelPath("resource:/cmusphinx-de-voxforge-5.2");
    // Set path to the dictionary.
    configuration.setDictionaryPath("resource:/cmusphinx-voxforge-de.dic");
    // Use a grammar instead of a language model.
    configuration.setGrammarPath("resource:/");
    configuration.setGrammarName("dialog");
    configuration.setUseGrammar(true);
    LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
    recognizer.startRecognition(true);
    SpeechResult result;
    while ((result = recognizer.getResult()) != null) {
        System.out.format("Hypothesis: %s\n", result.getHypothesis());
    }
    recognizer.stopRecognition();
}
Here is my grammar file:
#JSGF V1.0;
grammar dialog;
public <digit> = 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ja | nein;
I've downloaded the German acoustic model and dictionary from here: https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/German/
Is there something obvious I'm missing here? Where is the problem?
Thanks in advance and kind regards.
I have tried pocketsphinx with both the English and the German model, and the accuracy is good when recognition is limited to a predefined set of phrases. You can forget about general requests like "could you please find me a restaurant in the downtown".
To achieve good accuracy with pocketsphinx:
You can search for the Jasper project on GitLab to see how it's implemented. You can also check the documentation.
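As a rough illustration of what "a predefined set of phrases" can look like for the grammar in the question, here is a minimal sketch that generates the JSGF text from a word list. The class `GrammarBuilder`, the method `buildGrammar`, and the rule name `command` are hypothetical names, not part of CMUSphinx. It also assumes that the German dictionary lists spelled-out words such as `eins` rather than digit tokens such as `1`; I have not verified that against the `.dic` file, so check that every word in the grammar actually appears there.

```java
import java.util.List;

// Hypothetical helper, not part of CMUSphinx: builds the text of a small JSGF grammar.
class GrammarBuilder {

    // Returns a JSGF grammar whose single public rule matches exactly one of the given words.
    static String buildGrammar(String grammarName, String ruleName, List<String> words) {
        return "#JSGF V1.0;\n"
                + "grammar " + grammarName + ";\n"
                + "public <" + ruleName + "> = " + String.join(" | ", words) + ";\n";
    }

    public static void main(String[] args) {
        // Assumption: these spelled-out forms are the ones listed in the dictionary.
        // "f\u00fcnf" is the word for 5, written with a Unicode escape for the u-umlaut.
        List<String> words = List.of("eins", "zwei", "drei", "vier", "f\u00fcnf",
                "sechs", "sieben", "acht", "neun", "ja", "nein");

        // Print the grammar; the output could be saved as dialog.gram on the classpath.
        System.out.print(buildGrammar("dialog", "command", words));
    }
}
```

The generated file would take the place of the hand-written `dialog` grammar; the recognizer configuration itself stays the same.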