I am stuck converting MP3 to WAV. I have seen different answers, but I would like to go with pydub, which I have already tried using these few lines:
from pydub import AudioSegment
AudioSegment.from_mp3("/input/file.mp3").export("/output/file.wav", format="wav")
But when I run the above code, I get the following error:
C:\Python27\lib\site-packages\pydub-0.14.2-py2.7.egg\pydub\utils.py:165: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
Traceback (most recent call last):
  File "C:/Users/phourlhar/Desktop/VoiceDetector/yeah.py", line 7, in <module>
    stereo_to_mono()
  File "C:\Users\phourlhar\Desktop\VoiceDetector\utils.py", line 25, in stereo_to_mono
    sound = AudioSegment.from_mp3(PROJECT_DIR+'\\files\\rec'+str(c)+'.mp3')
  File "build\bdist.win32\egg\pydub\audio_segment.py", line 346, in from_file
  File "C:\Python27\lib\subprocess.py", line 711, in __init__
    errread, errwrite)
  File "C:\Python27\lib\subprocess.py", line 948, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
I don't know why it raises this error, as I am very sure the file exists. I have seen answers suggesting that I install ffmpeg, but I don't know whether that would affect deploying the app later on.
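The pydub module uses either the ffmpeg or the avconv program to do the actual conversion, so you do have to install ffmpeg to make this work. The "cannot find the file specified" error in your traceback comes from subprocess failing to launch that converter program, not from your MP3 file being missing.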
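Before running pydub, it can help to check whether a converter is actually reachable. The following is a minimal sketch, not part of pydub itself: the helper name find_converter is hypothetical, and it assumes Python 3.3 or later, because shutil.which does not exist on Python 2.7.

```python
import shutil


def find_converter(candidates=("ffmpeg", "avconv")):
    """Return the full path of the first candidate found on PATH, or None.

    find_converter is a hypothetical helper for illustration only.
    """
    for name in candidates:
        # shutil.which needs Python 3.3+; it is not available on Python 2.7.
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    converter = find_converter()
    if converter is None:
        print("Neither ffmpeg nor avconv is on PATH; install one or add it to PATH.")
    else:
        print("Found converter at", converter)
```

If this reports that nothing was found, pydub will most likely fail in the same way as in the traceback above.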
But if you don't need pydub for anything else, you can just use the built-in subprocess module to call a converter program like ffmpeg directly:
import subprocess
subprocess.call(['ffmpeg', '-i', '/input/file.mp3',
'/output/file.wav'])
This requires that the ffmpeg binary is in a location in your $PATH, by the way.
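If the binary might be missing, the call can be wrapped so that the failure is reported instead of raising. This is only a sketch under assumptions: the function name mp3_to_wav and its converter parameter are made up for illustration, and the -y flag (overwrite the output file without prompting) is an addition that the command above does not use.

```python
import subprocess


def mp3_to_wav(src, dst, converter="ffmpeg"):
    """Convert src to dst with an external converter; return True on success.

    mp3_to_wav and its parameters are hypothetical names for this sketch.
    """
    # -y (an addition here) overwrites dst without asking, so the call
    # cannot block waiting for input.
    cmd = [converter, "-y", "-i", src, dst]
    try:
        # subprocess.call returns the exit status; 0 means success.
        return subprocess.call(cmd) == 0
    except OSError:
        # Raised when the converter executable cannot be found or started,
        # which is the same situation as the WindowsError in the question.
        return False
```

Calling mp3_to_wav('/input/file.mp3', '/output/file.wav') then returns False rather than crashing when ffmpeg is not on the path.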
Edit: ffmpeg can downmix stereo to mono as well: pass -ac 1 before the output file name to set the number of output audio channels to 1.
Alternatively, the sox program can convert stereo to mono:
import subprocess
subprocess.call(['sox', '/input/file.mp3', '-e', 'mu-law',
'-r', '16k', '/output/file.wav', 'remix', '1,2'])
This will resample to 16 kHz with 8 bits/sample (mu-law encoding) and mix both channels into one, giving you 16 kB/s (128 kbit/s).
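The data rate follows directly from the sample rate and the sample size. As a small worked check (the function name data_rate is just for illustration):

```python
def data_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Return (bits per second, bytes per second) for uncompressed samples."""
    bits_per_second = sample_rate_hz * bits_per_sample * channels
    return bits_per_second, bits_per_second // 8


# 16 kHz, 8 bits/sample, mono, as produced by the sox command above.
bits, byte_count = data_rate(16000, 8)
print(bits, byte_count)  # 128000 16000
```

So one second of mono audio takes 16000 bytes, and one minute takes about 960 kB.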