Python pocketsphinx 0.1.15 configuration versus pocketsphinx_continuous.exe configuration


I have a setup (JSGF grammar, dictionary, HMM acoustic model) that works well with:

pocketsphinx_continuous -hmm zero_ru.cd_cont_4000 -dict vocabular.dict -jsgf calc.jsgf -inmic yes

Now I am trying to port it to Python pocketsphinx 0.1.15 (https://pypi.org/project/pocketsphinx/), and the verbose output shows that the Python pocketsphinx configuration is not the same as the pocketsphinx_continuous configuration.

As a result, Python pocketsphinx produces a lot of phantom detections.

My Python script is very simple:

from pocketsphinx import LiveSpeech

speech = LiveSpeech(
    verbose=True,
    hmm='c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000',
    lm=False,
    jsgf='c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/calc.jsgf',
    dic='c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/vocabular.dict',
    allphone_ci=False,
    vad_threshold=2.0,
    kws_threshold=1.0,
)

for phrase in speech:
    print(phrase)

Comparing the two outputs saved as text files, I see that pocketsphinx_continuous prints:

INFO: fe_interface.c(325): Using -1 as the seed.
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(154): Reading linear feature transformation from zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: zero_ru.cd_cont_4000/mdef
INFO: bin_mdef.c(181): Allocating 145321 * 8 bytes (1135 KiB) for CD tree

but Python pocketsphinx has:

INFO: fe_interface.c(324): Using -1 as the seed.
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(152): Reading linear feature transformation from c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/mdef

Now I am trying to configure Python pocketsphinx so that it behaves the same as pocketsphinx_continuous.

How do I make Python pocketsphinx use CMN='current' instead of CMN='batch'? That is, how do I make Python pocketsphinx show this in the output:

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(154): Reading linear feature transformation from zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: zero_ru.cd_cont_4000/mdef

instead of:

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(152): Reading linear feature transformation from c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/mdef

Solution

  • 'batch' and 'current' are the same CMN mode. The mode was renamed at some point, so the name you see in the log simply depends on the version.

    The phantom detections are probably a result of the very small vocabulary in your JSGF grammar, not of the CMN setting.
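
    To check whether anything other than naming differs between the two logs, it can help to normalize them before comparing. The following is only a sketch: the helper names (normalize, config_differences) and the CMN_ALIASES table are made up for illustration, and the alias mapping assumes, as stated above, that 'current' is just the older name for 'batch'. It strips the source line numbers and directory prefixes, which differ between builds and working directories but not between configurations.

```python
import re

# Assumption taken from the answer above: 'current' is the older name of 'batch'.
CMN_ALIASES = {"current": "batch"}


def normalize(line):
    """Remove details that vary between builds but do not reflect configuration."""
    line = line.strip()
    # Drop C source line numbers: "feat.c(715)" -> "feat.c"
    line = re.sub(r"(\.c)\(\d+\)", r"\1", line)
    # Reduce any path to its last component, so absolute and relative paths match
    line = re.sub(r"\S+/([^/\s]+)", r"\1", line)
    # Map old CMN mode names onto the new ones
    line = re.sub(
        r"CMN='(\w+)'",
        lambda m: "CMN='%s'" % CMN_ALIASES.get(m.group(1), m.group(1)),
        line,
    )
    return line


def config_differences(log_a, log_b):
    """Return (lines only in log_a, lines only in log_b) after normalization."""
    a = [normalize(l) for l in log_a.splitlines() if l.strip()]
    b = [normalize(l) for l in log_b.splitlines() if l.strip()]
    only_a = [l for l in a if l not in set(b)]
    only_b = [l for l in b if l not in set(a)]
    return only_a, only_b


# Excerpts copied from the question
continuous_log = """
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(154): Reading linear feature transformation from zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: zero_ru.cd_cont_4000/mdef
"""

python_log = """
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(152): Reading linear feature transformation from c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/feature_transform
INFO: mdef.c(518): Reading model definition: c:/Projects/pocketsphinx-5prealpha-win32/pocketsphinx/bin/Release/x64/zero_ru.cd_cont_4000/mdef
"""

only_continuous, only_python = config_differences(continuous_log, python_log)
print("Only in pocketsphinx_continuous:", only_continuous)
print("Only in Python pocketsphinx:", only_python)
```

    With the excerpts from the question, the feature stream, acmod and mdef lines match after normalization, and the only line left over is the cmn.c one printed by pocketsphinx_continuous. Running the same comparison on the full logs should show whether any real parameter differs between the two setups.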