I use Pocketsphinx in my Android app. I have a relatively small set of commands to be recognized independently, so I ended up using a keyword search from a file that looks like this:
one/1.0/
done/1.0/
recognition on/1e-10/
recognition off/1e-10/
The actual list is not in English so these keywords are chosen arbitrarily for the sake of the example. I realize that these thresholds may be somewhat less than optimal, and that short words are prone to mismatches.
The problem arises in this method:
@Override
public void onPartialResult(Hypothesis hypothesis) {
    if (hypothesis != null) {
        Log.d(
            "Sphinx",
            "\"" + hypothesis.getHypstr() + "\" recognized"
        );
    }
}
Note that some words sound pretty much alike. The problem is that I get results like these:
"done one" recognized
"one done" recognized
Unfortunately, I couldn't find any documentation on hypstr_get (I would appreciate it if you could direct me to it), but effectively it seems to return a joined string of probable matches in increasing order of probability.
How can I retrieve the actual commands from hypothesis? I can't just split hypothesis.getHypstr() by whitespace, since some commands are keyphrases rather than keywords. I only want a single, most probable result.
Thanks.
You can iterate over the segments; each segment corresponds to one detected keyword or keyphrase:
for (Segment seg : recognizer.getDecoder().seg()) {
    System.out.println(seg.getWord() + " " + seg.getProb());
}
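
If you only want a single result, one option is to keep the segment with the highest score while iterating. Below is a minimal sketch of that selection step, not a definitive implementation. To keep it runnable without the library, it uses a hypothetical stand-in class Seg instead of the real Segment, and the scores are made up for illustration. It also assumes that getProb() returns an integer score where a higher value means a more likely match, which you should verify against your own logs.

```java
import java.util.Arrays;
import java.util.List;

public class BestKeyword {

    // Hypothetical stand-in for the library's Segment class, so the sketch
    // compiles on its own. It models only the two values used above:
    // the word (getWord()) and the score (getProb()).
    public static final class Seg {
        final String word;
        final int prob;

        public Seg(String word, int prob) {
            this.word = word;
            this.prob = prob;
        }
    }

    // Returns the word of the highest-scoring segment, or null if there are none.
    // Assumption: a higher score means a more probable match.
    public static String pickBest(List<Seg> segments) {
        Seg best = null;
        for (Seg seg : segments) {
            if (best == null || seg.prob > best.prob) {
                best = seg;
            }
        }
        return best == null ? null : best.word;
    }

    public static void main(String[] args) {
        // Made-up scores, for illustration only.
        List<Seg> segments = Arrays.asList(
            new Seg("done", -3200),
            new Seg("recognition on", -1500),
            new Seg("one", -2100));
        System.out.println(pickBest(segments));
    }
}
```

In the app, the same loop would run over recognizer.getDecoder().seg() and compare seg.getProb() directly. Because each segment carries a whole keyphrase in getWord(), multi-word commands such as "recognition on" stay intact and no whitespace splitting is needed.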