java speech-recognition cmusphinx sphinx4

Getting started with Speech Recognition and Sphinx


Sphinx seems to be the only real option for Java speech recognition. The documentation is sparse and assumes a high level of domain knowledge. I used their example starter program, and it works for one file but not for another, extremely similar file. What is the difference? What is the secret to getting it to work more accurately?

This file, https://www.opdsupport.com/downloads/miscellaneous/sample-audio-files/52-welcome-wav/download, works, but this one, https://www.opdsupport.com/downloads/miscellaneous/sample-audio-files/49-longwelcome-wav/download, does not.
I noticed that the non-working file had a different sample rate, so I used a program to convert it to 16000 Hz, but still no luck.


Solution

  • Make sure to inspect the file carefully. According to the docs, your file must be either 8 kHz or 16 kHz and mono only. There are many tools available for the conversion. I use Audacity, though it is probably overkill for a basic conversion like this.
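
If you would rather check the file from Java than open it in an audio editor, something like the following might do. It is only a rough sketch using the standard `javax.sound.sampled` API. The class name `WavFormatCheck` and the helper `isSphinxCompatible` are made up for illustration and are not part of Sphinx. It tests only the two properties mentioned above (sample rate and channel count) and prints the full format so you can look at everything else yourself.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.File;

public class WavFormatCheck {

    // Hypothetical helper (not a Sphinx API): returns true when the format
    // is 8 kHz or 16 kHz and has exactly one channel (mono).
    public static boolean isSphinxCompatible(AudioFormat format) {
        float rate = format.getSampleRate();
        boolean rateOk = rate == 8000f || rate == 16000f;
        boolean monoOk = format.getChannels() == 1;
        return rateOk && monoOk;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java WavFormatCheck <file.wav>");
            return;
        }
        File wav = new File(args[0]);
        // Reads the file header and exposes the declared audio format.
        try (AudioInputStream in = AudioSystem.getAudioInputStream(wav)) {
            AudioFormat format = in.getFormat();
            System.out.println("Format:      " + format);
            System.out.println("Sample rate: " + format.getSampleRate() + " Hz");
            System.out.println("Channels:    " + format.getChannels());
            System.out.println("Looks usable: " + isSphinxCompatible(format));
        }
    }
}
```

Run it on both the working and the non-working file and compare the output. If the sample rate is already 16000 Hz but the file is stereo, that alone may explain why resampling did not help.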