python python-3.x speech-recognition speech-to-text transcription

Transcribing mp3 to text (python) --> "RIFF id" error


I am trying to transcribe an mp3 file to text, but my code raises the error shown below. Any help is appreciated!

This is a sample mp3 file. And below is what I have tried:

import speech_recognition as sr
print(sr.__version__)
r = sr.Recognizer()

file_audio = sr.AudioFile(r"C:\Users\Andrew\Podcast.mp3")

with file_audio as source:
    audio_text = r.record(source)

print(type(audio_text))
print(r.recognize_google(audio_text))

The full error I get appears to be:

Error: file does not start with RIFF id

Thank you for your help!


Solution

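  • You need to convert the mp3 to wav first and then transcribe the wav file. sr.AudioFile reads only WAV, AIFF, and FLAC files, which is why passing an mp3 fails with the "RIFF id" error. Below is a modified version of your code.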

    import speech_recognition as sr
    from pydub import AudioSegment
    
    # convert the mp3 file to wav (pydub needs ffmpeg installed to decode mp3)
    # raw strings keep backslashes such as "\U" from being read as escapes
    src = r"C:\Users\Andrew\Podcast.mp3"
    dst = r"C:\Users\Andrew\Podcast.wav"
    sound = AudioSegment.from_mp3(src)
    sound.export(dst, format="wav")
    
    # use the converted wav file as the audio source
    file_audio = sr.AudioFile(dst)
    r = sr.Recognizer()
    with file_audio as source:
        # read the whole file into an AudioData object
        audio_text = r.record(source)
    
    print(type(audio_text))
    print(r.recognize_google(audio_text))
    

    The modified code first converts the mp3 file to wav and then transcribes the wav file.
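
If you want to see why the original call failed, you can check the first bytes of a file yourself. A WAV file starts with the ASCII bytes "RIFF", a 4-byte size field, and then "WAVE"; an mp3 does not, hence the "RIFF id" message. The sketch below is only an illustration using the standard library: the helper name looks_like_wav and the temporary demo files are made up for this example and are not part of speech_recognition or pydub.

```python
import os
import tempfile
import wave


def looks_like_wav(path):
    """Return True if the file begins with a RIFF/WAVE header (hypothetical helper)."""
    with open(path, "rb") as f:
        header = f.read(12)
    # bytes 0-3 are "RIFF", bytes 4-7 are the chunk size, bytes 8-11 are "WAVE"
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


# Demo with throwaway files: a tiny real WAV and a stand-in for an mp3.
demo_dir = tempfile.mkdtemp()
demo_wav = os.path.join(demo_dir, "demo.wav")
demo_mp3 = os.path.join(demo_dir, "demo.mp3")

with wave.open(demo_wav, "wb") as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 16-bit samples
    w.setframerate(8000)   # 8 kHz
    w.writeframes(b"\x00\x00" * 8)  # eight silent frames

with open(demo_mp3, "wb") as f:
    # many mp3 files begin with an "ID3" tag; this is not a playable file
    f.write(b"ID3" + b"\x00" * 32)

print(looks_like_wav(demo_wav))
print(looks_like_wav(demo_mp3))
```

Running a check like this on Podcast.mp3 before calling sr.AudioFile would tell you up front whether the file needs converting.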