I have a setup (jsgf, dict, hmm) that works well with:
pocketsphinx_continuous -hmm zero_ru.cd_cont_4000 -dict vocabular.dict -jsgf calc.jsgf -inmic yes
Now I am trying to port it to Python pocketsphinx 0.1.15 (https://pypi.org/project/pocketsphinx/), and I see in the verbose output that the configuration of Python pocketsphinx is not the same as the configuration of pocketsphinx_continuous.
As a result, Python pocketsphinx produces a lot of false (phantom) detections.
My Python script is very simple:
from pocketsphinx import LiveSpeech

speech = LiveSpeech(
    verbose=True,
    hmm='c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000',
    lm=False,
    jsgf='c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/calc.jsgf',
    dic='c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/vocabular.dict',
    allphone_ci=False,
    vad_threshold=2.0,
    kws_threshold=1.0,
)
# Print each recognized phrase as it is detected
for phrase in speech:
    print(phrase)
Comparing text files with the two outputs, I see that pocketsphinx_continuous prints:
INFO: fe_interface.c(325): Using -1 as the seed.
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(154): Reading linear feature transformation from zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: zero_ru.cd_cont_4000/mdef
INFO: bin_mdef.c(181): Allocating 145321 * 8 bytes (1135 KiB) for CD tree
but Python pocketsphinx prints:
INFO: fe_interface.c(324): Using -1 as the seed.
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(152): Reading linear feature transformation from c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/mdef
Now I am trying to configure Python pocketsphinx the same way as pocketsphinx_continuous.
How do I make Python pocketsphinx use CMN='current' instead of CMN='batch'?
I.e., how do I make Python pocketsphinx show this in the output:
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(154): Reading linear feature transformation from zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: zero_ru.cd_cont_4000/mdef
instead of:
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(152): Reading linear feature transformation from c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/mdef
Batch and current are the same mode. It was simply renamed at some point, so which name you see depends on the version.
The phantom detections are probably a result of the very small vocabulary in your jsgf, not of the CMN setting.
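If you want to diff the two logs without tripping over the renamed mode, here is a minimal sketch that normalizes the CMN name before comparing. The helper `cmn_mode` and the `CMN_ALIASES` table are hypothetical names made up for this illustration, not part of pocketsphinx, and the alias mapping only encodes the statement above that 'current' and 'batch' are the same mode.

```python
import re

# Assumption (from the answer above): 'current' is just another name for 'batch'.
CMN_ALIASES = {"current": "batch"}

# Matches the CMN='...' field in the feat.c "Initializing feature stream" line.
CMN_FIELD = re.compile(r"CMN='(?P<cmn>[^']+)'")


def cmn_mode(log_text):
    """Return the normalized CMN mode named in a log, or None if absent."""
    match = CMN_FIELD.search(log_text)
    if match is None:
        return None
    name = match.group("cmn")
    return CMN_ALIASES.get(name, name)


# The two feat.c lines quoted in the question.
continuous_log = (
    "INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', "
    "ceplen=13, CMN='current', VARNORM='no', AGC='none'"
)
python_log = (
    "INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', "
    "ceplen=13, CMN='batch', VARNORM='no', AGC='none'"
)

# Both logs resolve to the same mode once the alias is applied.
print(cmn_mode(continuous_log) == cmn_mode(python_log))
```

With the alias applied, the feat.c lines from both tools compare equal, so the remaining differences in the logs are the ones worth looking at.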