python audio beat-detection

Extracting the audio spectrum from an audio file with Python


Sorry if this is a duplicate, but is there a Python library that can extract the sound spectrum from audio files? I want to take an audio file and write an algorithm that returns a set of data {TimeStampInFile; Frequency-Amplitude}.

I have heard that this is usually called beat detection, but as far as I can see beat detection is not a precise method and is only good for visualisation, whereas I want to manipulate the extracted data and then convert it back to an audio file. I don't need to do this in real time.

I would appreciate any suggestions and recommendations.


Solution

  • You can compute and visualize the spectrum and the spectrogram using scipy. For this test I used this audio file: vignesh.wav

    from scipy.io import wavfile # scipy library to read wav files
    import numpy as np
    
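    AudioName = "vignesh.wav" # Audio file
    fs, Audiodata = wavfile.read(AudioName)
    if Audiodata.ndim > 1: # stereo/multichannel file: the code below expects one channel
        Audiodata = Audiodata[:, 0]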
    
    # Plot the audio signal in time
    import matplotlib.pyplot as plt
    plt.plot(Audiodata)
    plt.title('Audio signal in time',size=16)
    
    # spectrum
    from scipy.fft import fft # Fourier transform (scipy.fftpack is the legacy interface)
    n = len(Audiodata) 
    AudioFreq = fft(Audiodata)
    AudioFreq = AudioFreq[0:int(np.ceil((n+1)/2.0))] #Half of the spectrum
    MagFreq = np.abs(AudioFreq) # Magnitude
    MagFreq = MagFreq / float(n)
    # power spectrum
    MagFreq = MagFreq**2
    if n % 2 > 0: # odd number of samples: double every bin except DC
        MagFreq[1:len(MagFreq)] = MagFreq[1:len(MagFreq)] * 2
    else: # even number of samples: double every bin except DC and Nyquist
        MagFreq[1:len(MagFreq) - 1] = MagFreq[1:len(MagFreq) - 1] * 2
    
    plt.figure()
    freqAxis = np.arange(0, int(np.ceil((n+1)/2.0)), 1.0) * (fs / float(n))
    plt.plot(freqAxis/1000.0, 10*np.log10(MagFreq)) # Power spectrum
    plt.xlabel('Frequency (kHz)')
    plt.ylabel('Power spectrum (dB)')
    
    
    # Spectrogram
    from scipy import signal
    N = 512 # Number of points in the FFT
    # Window functions live in scipy.signal.windows (signal.blackman was removed in recent SciPy)
    f, t, Sxx = signal.spectrogram(Audiodata, fs, window=signal.windows.blackman(N), nfft=N)
    plt.figure()
    plt.pcolormesh(t, f, 10*np.log10(Sxx)) # dB spectrogram
    #plt.pcolormesh(t, f, Sxx) # Linear spectrogram
    plt.ylabel('Frequency [Hz]')
    plt.xlabel('Time [s]')
    plt.title('Spectrogram with scipy.signal', size=16)
    
    plt.show()
    

    I tested all the code and it works. You need numpy, matplotlib and scipy.
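The question also asks for {timestamp; frequency; amplitude} data that can be edited and turned back into audio. The code above only plots, so here is a minimal sketch of one possible round trip using scipy.signal.stft and scipy.signal.istft. It is not part of the tested code above. The synthetic two-tone signal, the 2 kHz cutoff and the output name filtered.wav are assumptions chosen for illustration; in practice the samples would come from wavfile.read.

```python
import numpy as np
from scipy import signal
from scipy.io import wavfile

# Assumption: a synthetic 1 s mono signal (440 Hz + 3000 Hz) stands in
# for the samples that wavfile.read() would return.
fs = 16000
time = np.arange(fs) / fs
audio = 0.5 * np.sin(2 * np.pi * 440 * time) + 0.25 * np.sin(2 * np.pi * 3000 * time)

# Short-time Fourier transform: Zxx[i, j] is the complex value at f[i], t[j].
N = 512
f, t, Zxx = signal.stft(audio, fs, window="hann", nperseg=N)

# One (timestamp, frequency, amplitude) record per time/frequency bin.
tt, ff = np.meshgrid(t, f)  # both have the same shape as Zxx
records = np.column_stack([tt.ravel(), ff.ravel(), np.abs(Zxx).ravel()])

# Example manipulation (illustrative only): silence everything above 2 kHz.
Zxx_filtered = Zxx.copy()
Zxx_filtered[f > 2000, :] = 0

# Inverse STFT back to samples, trimmed to the original length.
_, restored = signal.istft(Zxx_filtered, fs, window="hann", nperseg=N)
restored = restored[:len(audio)]

# Convert to 16-bit PCM and write the result to a WAV file.
pcm = np.int16(np.clip(restored, -1.0, 1.0) * 32767)
wavfile.write("filtered.wav", fs, pcm)
```

The complex Zxx array keeps the phase, which is what makes the inverse transform possible. The records table holds magnitudes only, so it is convenient for inspection but cannot be inverted on its own.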

    cheers