
Python Pocketsphinx: Keyword not being recognised from a .wav file


I'm trying to detect the keyword temperature in a recording of me saying only the phrase temperature (no other words are present). Originally I used the keyword hello and it worked fine, but it does not work with any other word I try. My current code is as follows:

import pocketsphinx as ps
import requests
import json
import sys, os

model_path = ps.get_model_path()
data_path = ps.get_data_path()

# Call to API
def get_temperature():
    headers = {
        'accept': 'application/json',
        'x-api-key': 'REMOVED'
    }

    response = requests.get(url=TEMPERATURE_URL, headers=headers)
    print("Response Code: ", response)

    temperature_data = response.json()
    print(temperature_data)
    temp = temperature_data[0]["value"]
    return temp

print("start")
while True:
    speech = ps.AudioFile(lm=False, kws='keyphrase.list', kws_threshold=1e-1)
    for phrase in speech:
        print("--------------------------------------------------------------")
        print(phrase.segments(detailed=True))
        print(phrase)
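        # Compare the hypothesis text; calling __eq__ on the decoder object does not compare strings
        if str(phrase).strip() == 'temperature':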
            print("if equal")
            temperature = get_temperature()
            print("Temperature: ", temperature)

The contents of my keyphrase.list file are:

temperature /1e-1/

It currently starts and runs but doesn't detect anything.

Edit: Here is the audio file I am using


Solution

  • Your file has the wrong format:

    file client_audio.wav 
    client_audio.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, 4 channels 20000 Hz
    

    You have to convert it to the proper format (16-bit, mono, 16 kHz) before decoding; it will not work otherwise.

    If the keyword is still missed or triggers falsely after conversion, try different threshold values such as 1e-10, 1e-20, 1e-30, 1e-40 to balance detections against false alarms.
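
The conversion step could be sketched in Python with only the standard library, as below. This is a minimal sketch, not the answer's own method: the function name `convert_to_16k_mono` is hypothetical, the downmix simply averages the channels, and the resampling is plain linear interpolation with no anti-aliasing filter. A dedicated tool such as sox or ffmpeg will likely give better quality.

```python
import array
import sys
import wave


def convert_to_16k_mono(src_path, dst_path, target_rate=16000):
    """Convert a 16-bit PCM WAV file to 16-bit mono at target_rate (hypothetical helper)."""
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        width = src.getsampwidth()
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if width != 2:
        raise ValueError("expected 16-bit PCM, got %d-bit" % (width * 8))

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Downmix: average the interleaved channel samples of each frame.
    n_frames = len(samples) // channels
    mono = [
        sum(samples[i * channels:(i + 1) * channels]) // channels
        for i in range(n_frames)
    ]

    # Resample by linear interpolation between neighbouring samples.
    if rate != target_rate and n_frames > 1:
        n_out = int(n_frames * target_rate / rate)
        step = rate / target_rate
        resampled = []
        for j in range(n_out):
            pos = j * step
            i = min(int(pos), n_frames - 1)
            frac = pos - i
            nxt = mono[min(i + 1, n_frames - 1)]
            resampled.append(int(round(mono[i] * (1 - frac) + nxt * frac)))
        mono = resampled

    out = array.array("h", mono)
    if sys.byteorder == "big":
        out.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(target_rate)
        dst.writeframes(out.tobytes())
```

For example, calling `convert_to_16k_mono('client_audio.wav', 'client_audio_16k.wav')` (the output name is made up here) should produce a file that can then be handed to the decoder, assuming `ps.AudioFile` is given the path through its `audio_file` keyword.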