python vosk

Vosk output only shows the last batch of the transcription


When I run test_ffmpeg.py on my own audio file, it doesn't show the full transcription at the end, only the last paragraph/batch. For example, my audio file goes like this:

The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy cat
The quick brown fox jumps over the lazy rat
The quick brown fox jumps over the lazy bat

While the script runs, all of these lines flash by and appear to be transcribed as expected, but at the end of the script only "The quick brown fox jumps over the lazy bat" is shown as the final result.

Looking at the script itself, I expected this line to print all of the transcribed text, but it only prints the last part that was captured:

print(rec.FinalResult())

Solution

  • rec.FinalResult() is only intended to be called at the end of the file, as the documentation indicates:

    Returns speech recognition result. Same as result, but doesn't wait for silence You usually call it in the end of the stream to get final bits of audio. It flushes the feature pipeline, so all remaining audio chunks got processed.

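    Because it only returns the remaining bits of audio, it will not contain the utterances recognized earlier. You are likely looking for a solution that appends each result to a list as it is produced. You can then print the list or write it to a file.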

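    # rec (the KaldiRecognizer) and process (the ffmpeg subprocess) are
    # assumed to be set up earlier, as in test_ffmpeg.py.
    results = []
    while True:
        data = process.stdout.read(4000)
        if len(data) == 0:
            break
        # AcceptWaveform() returns True once an utterance is complete
        if rec.AcceptWaveform(data):
            results.append(rec.Result())
    # Flush whatever audio is still buffered at the end of the stream
    results.append(rec.FinalResult())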
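
Each entry collected this way is a JSON string with a "text" field, not plain text, so printing the list directly is a bit noisy. A minimal sketch of turning the list into a readable transcript might look like the following. The helper name join_transcript and the sample strings are made up for illustration; in the real script you would pass in the results list built by the loop above.

```python
import json


def join_transcript(results):
    """Combine the "text" field of each JSON result string into one transcript."""
    pieces = []
    for raw in results:
        # Each result is a JSON string such as '{"text": "..."}'
        text = json.loads(raw).get("text", "")
        if text:  # skip empty results, e.g. stretches of silence
            pieces.append(text)
    return "\n".join(pieces)


# Hypothetical sample data standing in for the list built from
# rec.Result() and rec.FinalResult()
sample_results = [
    '{"text": "the quick brown fox jumps over the lazy dog"}',
    '{"text": ""}',
    '{"text": "the quick brown fox jumps over the lazy bat"}',
]
print(join_transcript(sample_results))
```

Joining with a newline keeps one utterance per line; use a space instead if you prefer a single paragraph.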