I'm looking for some C/C++ code for VAD (Voice Activity Detection).
Basically, my application is reading PCM frames from the device. I would like to know when the user is talking. I'm not looking for any speech recognition algorithm but only for voice detection.
I would like to know when the user starts talking and when they finish, with something like:
bool isVAD(short* pcm,size_t count);
There are open source implementations in the Sphinx and FreeSWITCH projects. I think they are all energy-based detectors, so they won't need any kind of model.
Sphinx 4 (Java, but it should be easy to port to C/C++)
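
To give an idea of what an energy-based detector looks like, here is a minimal sketch in C++. It is not the Sphinx or FreeSWITCH code: it simply compares the RMS amplitude of each frame against a fixed threshold. The names `frameRms`, `VadTracker` and `kRmsThreshold`, the threshold value of 500 and the hangover of 20 frames are all assumptions made for illustration, and you would need to tune them for your microphone and noise floor. `isVAD` keeps the signature from the question, with `const` added to the pointer.

```cpp
#include <cmath>
#include <cstddef>

// Assumed threshold: RMS amplitude above which a frame counts as speech.
// 16-bit PCM samples range over [-32768, 32767]; tune this value.
constexpr double kRmsThreshold = 500.0;

// Root-mean-square amplitude of one frame of 16-bit PCM samples.
double frameRms(const short* pcm, std::size_t count) {
    if (pcm == nullptr || count == 0) return 0.0;
    double sumSquares = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        const double s = static_cast<double>(pcm[i]);
        sumSquares += s * s;
    }
    return std::sqrt(sumSquares / static_cast<double>(count));
}

// Per-frame decision: true if the frame's energy exceeds the threshold.
bool isVAD(const short* pcm, std::size_t count) {
    return frameRms(pcm, count) > kRmsThreshold;
}

// Tracks talking/not-talking across frames. Speech is considered finished
// only after `hangoverFrames` consecutive quiet frames, so short pauses
// between words do not end the utterance.
class VadTracker {
public:
    explicit VadTracker(int hangoverFrames = 20)
        : hangoverFrames_(hangoverFrames) {}

    // Feed one frame; returns true while the user is considered to be talking.
    bool update(const short* pcm, std::size_t count) {
        if (isVAD(pcm, count)) {
            quietFrames_ = 0;
            talking_ = true;
        } else if (talking_ && ++quietFrames_ >= hangoverFrames_) {
            talking_ = false;
        }
        return talking_;
    }

private:
    int hangoverFrames_;
    int quietFrames_ = 0;
    bool talking_ = false;
};
```

Call `update` once per frame read from the device: a change of the return value from false to true marks the start of speech, and a change from true to false marks the end. A fixed threshold is the simplest choice; a detector meant for varying background noise would typically estimate the noise level at run time instead.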