python machine-learning librosa audio-processing

Getting different background colours in spectrograms from audio files


import glob

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import librosa
import librosa.display  # needed for librosa.display.specshow

path = r'/content/drive/MyDrive/ESC-50/305 - Coughing/*.ogg'
files = glob.glob(path)
print(len(files))

for file in files:
    # Load the clip (librosa resamples to 22050 Hz by default)
    scale, sr = librosa.load(file)

    # Mel spectrogram (power); pass the signal by keyword (y=) for newer librosa versions
    mel_spectrogram = librosa.feature.melspectrogram(
        y=scale, sr=sr, n_fft=1024, hop_length=512, n_mels=228
    )

    # Convert power to decibels
    log_mel_spectrogram = librosa.power_to_db(mel_spectrogram)

    plt.figure(figsize=(10, 5))
    librosa.display.specshow(log_mel_spectrogram, x_axis="time",
                             y_axis="mel",  # rows are mel bands
                             sr=sr)
    plt.colorbar(format="%+2.f")
    plt.show()

I am trying to read audio files and convert them into mel spectrograms to train a machine learning model. However, the spectrograms come out with different background colours, even though all the clips have the same length and the same sampling frequency. I want every spectrogram to have the same background so that I can get better accuracy from my model.

https://i.sstatic.net/beDR8.png


Solution

  • The values of your spectrograms look reasonable and are generally in the same range for all the audio clips. But you have not specified the color map when plotting, so some clips get a different color map because librosa picks one automatically based on the data. Specify cmap='magma' in librosa.display.specshow and that should no longer be a problem.

    Note that for machine learning, you should not use the plot of the spectrogram, but the spectrogram values directly. If you want an image representation of that, see https://stackoverflow.com/a/57204349/1967571
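
As a rough sketch of the note above: assuming the log-mel spectrogram was computed with librosa.power_to_db(S, ref=np.max), so that its values lie between -80 and 0 dB with the default top_db, the values can be mapped to an 8-bit grayscale image using a fixed range. That way the same dB value gets the same pixel value in every clip. The function name log_mel_to_uint8 and the min_db / max_db defaults are illustrative choices here, not part of librosa.

```python
import numpy as np


def log_mel_to_uint8(log_mel, min_db=-80.0, max_db=0.0):
    """Map a dB-scaled spectrogram to a uint8 image using a fixed dB range.

    Assumes log_mel is a 2D array of shape (n_mels, n_frames) in decibels.
    min_db and max_db are hypothetical defaults; adjust them to your data.
    """
    # Clip to the fixed range so every clip shares the same scale
    clipped = np.clip(log_mel, min_db, max_db)
    # Rescale to [0, 1], then to [0, 255]
    scaled = (clipped - min_db) / (max_db - min_db)
    image = np.round(scaled * 255).astype(np.uint8)
    # Flip vertically so low frequencies end up at the bottom of the image
    return np.flipud(image)
```

If you still want plots for inspection, passing the same fixed range together with the color map, e.g. librosa.display.specshow(log_mel_spectrogram, sr=sr, cmap='magma', vmin=-80, vmax=0), should keep the colors comparable between clips, since specshow forwards extra keyword arguments to matplotlib's pcolormesh.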