[SOLVED] How to convert speech to text in python

How to convert speech to text in python - opus file format

I have some .opus audio files that I need to convert to text so that I can run some analytics on them. I know that the Python SpeechRecognition package can do this with .wav files, as demonstrated in this tutorial.

Does anyone know how to convert .opus files to text, or convert .opus to .wav?

I have tried the Python SpeechRecognition package with no success.

Solution

Here is a solution that calls ffmpeg through the os module to convert every .opus file in the specified directory to .wav, and then runs speech recognition on the resulting .wav files with the speech_recognition module. It assumes ffmpeg is installed and available on your PATH:

import os
import speech_recognition as sr

path = './audio-files/'
file_type_to_convert = ".opus"
file_type_to_recognize = ".wav"

for filename in os.listdir(path):
    if filename.endswith(file_type_to_convert):
        os.system("ffmpeg -i \"{}\" -vn \"{}\"".format(path + filename,
                                                       path + filename[:-len(file_type_to_convert)] +
                                                       file_type_to_recognize))
recognizer = sr.Recognizer()  # Instantiate recognizer
rec_output = {}  # Dict mapping each file name (without extension) to its recognized text

# Iterate over each file of specified type to be recognized
for file_to_recognize in os.listdir(path):
    if file_to_recognize.endswith(file_type_to_recognize):
        audio = sr.AudioFile(path + file_to_recognize)
        with audio as source:
            # record() must be given the opened source, not the AudioFile object itself
            audio_data = recognizer.record(source)
        # Recognize & store output
        # Note: recognize_google needs an internet connection; recognize_sphinx (CMU Sphinx engine) works offline
        rec_output[file_to_recognize[:-len(file_type_to_recognize)]] = recognizer.recognize_google(audio_data,
                                                                                                   language='en-US')

# Display each file's output
for key, val in rec_output.items():
    print(key)
    print(val)
    # Output: 
    # File name
    # Recognized words in each file
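
If the os.system call gives you trouble (file names with quotes in them, or ffmpeg failing silently), the conversion step could also be sketched with subprocess and pathlib. This is only a rough variant of the loop above, not a drop-in replacement: the helper names build_ffmpeg_command and convert_directory are made up for illustration, and I added the -y flag (overwrite existing output) on the assumption that you want re-runs to replace old .wav files.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst_suffix=".wav"):
    """Return the ffmpeg argument list that converts src to a sibling file.

    Hypothetical helper: the output keeps the same name, with dst_suffix.
    """
    src = Path(src)
    dst = src.with_suffix(dst_suffix)
    # -y overwrites an existing output file, -vn ignores any video stream
    return ["ffmpeg", "-y", "-i", str(src), "-vn", str(dst)]


def convert_directory(directory, src_suffix=".opus", dst_suffix=".wav"):
    """Convert every src_suffix file in directory; return the output paths.

    Hypothetical helper: assumes ffmpeg is installed and on the PATH.
    """
    converted = []
    for src in sorted(Path(directory).glob("*" + src_suffix)):
        # Passing a list (no shell) avoids quoting problems with odd file names;
        # check=True raises CalledProcessError if ffmpeg exits with an error
        subprocess.run(build_ffmpeg_command(src, dst_suffix), check=True)
        converted.append(src.with_suffix(dst_suffix))
    return converted
```

Calling convert_directory('./audio-files/') should then give you the list of .wav paths, and each one can be passed to sr.AudioFile as str(path) in the recognition loop.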