android voice-recognition cmusphinx pocketsphinx-android

Split hypothesis on individual keyphrases


I use Pocketsphinx in my Android app. I have a relatively small set of commands to be recognized independently, so I ended up using a keyword search from a file that looks like this:

one/1.0/
done/1.0/
recognition on/1e-10/
recognition off/1e-10/

The actual list is not in English so these keywords are chosen arbitrarily for the sake of the example. I realize that these thresholds may be somewhat less than optimal, and that short words are prone to mismatches.
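To make the file format concrete, here is a minimal sketch in plain Java that reads entries of the form `phrase/threshold/`, as in the sample above. The `KeywordList` class and its `parse` method are hypothetical helpers written for this illustration; they are not part of Pocketsphinx. The sketch only shows that a single entry may contain spaces, which is why the hypothesis string cannot be split on whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pocketsphinx: parses "phrase/threshold/" lines.
final class KeywordList {
    /** One entry of the list: the keyphrase and its detection threshold. */
    static final class Entry {
        final String phrase;
        final double threshold;

        Entry(String phrase, double threshold) {
            this.phrase = phrase;
            this.threshold = threshold;
        }
    }

    /** Parses the lines of a keyword list, skipping blank lines. */
    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (!line.endsWith("/")) {
                throw new IllegalArgumentException("Missing closing slash: " + line);
            }
            // The threshold sits between the last two slashes.
            int open = line.lastIndexOf('/', line.length() - 2);
            if (open < 0) {
                throw new IllegalArgumentException("Missing threshold: " + line);
            }
            String phrase = line.substring(0, open).trim();
            double threshold =
                    Double.parseDouble(line.substring(open + 1, line.length() - 1));
            entries.add(new Entry(phrase, threshold));
        }
        return entries;
    }
}
```

With the sample list, "recognition on" comes back as one entry with threshold 1e-10, not as two separate words.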

The problem arises in this method:

@Override
public void onPartialResult(Hypothesis hypothesis) {
    if (hypothesis != null) {
        Log.d(
                "Sphinx",
                "\"" + hypothesis.getHypstr() + "\" recognized"
        );
    }
}

Note that some words sound pretty much alike. The thing is, the logged hypothesis often seems to contain several keywords at once rather than a single one.

Unfortunately, I couldn't find any documentation on hypstr_get, the function behind getHypstr() (I would appreciate it if you could direct me to it), but in practice it seems to return the probable matches joined into one string, in increasing order of probability.

How can I retrieve the actual commands from the hypothesis? I can't just split hypothesis.getHypstr() on whitespace, since some commands are keyphrases rather than single keywords. I only want a single, most probable result.

Thanks.


Solution

  • You can iterate over the decoder's segments instead of splitting the string; each segment is one detected keyword and carries its own score:

        for (Segment seg : recognizer.getDecoder().seg()) {
            System.out.println(seg.getWord() + " " + seg.getProb());
        }
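
Since the question asks for a single, most probable result, the loop above can be extended to keep only the best-scoring segment. The following is a minimal sketch of that selection step in plain Java. `BestSegment` and `Candidate` are hypothetical stand-ins for the values read from `seg.getWord()` and `seg.getProb()`, and the sketch assumes that a higher `getProb()` value means a more likely match.

```java
import java.util.List;

// Hypothetical helper: selects the best-scoring keyword from decoder segments.
final class BestSegment {
    /** Stand-in for the (word, score) pair read from each segment. */
    static final class Candidate {
        final String word;
        final int prob;

        Candidate(String word, int prob) {
            this.word = word;
            this.prob = prob;
        }
    }

    /** Returns the word with the highest score, or null if the list is empty. */
    static String mostProbable(List<Candidate> candidates) {
        Candidate best = null;
        for (Candidate c : candidates) {
            // Assumption: a larger score means a more probable match.
            if (best == null || c.prob > best.prob) {
                best = c;
            }
        }
        return best == null ? null : best.word;
    }
}
```

In the recognizer callback, you would fill the list inside the segment loop, for example with `new BestSegment.Candidate(seg.getWord(), seg.getProb())`, and then call `mostProbable` once the loop finishes. Because each candidate keeps the whole segment text, keyphrases such as "recognition on" stay intact.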