python mfcc

Why does librosa's librosa.feature.mfcc() spit out a 2D array?


Calling librosa.feature.mfcc() on an audio file spits out a 2D array like so:

array([[ -5.229e+02,  -4.944e+02, ...,  -5.229e+02,  -5.229e+02],
   [  7.105e-15,   3.787e+01, ...,  -7.105e-15,  -7.105e-15],
   ...,
   [  1.066e-14,  -7.500e+00, ...,   1.421e-14,   1.421e-14],
   [  3.109e-14,  -5.058e+00, ...,   2.931e-14,   2.931e-14]])

My question is: what are these values? I was expecting a 1D array of coefficients, so why is it 2D, and what do the two dimensions represent? Maybe I'm misunderstanding what I should be getting back, but any explanation would be appreciated. I tried looking online, but everyone seems to just know what it is.


Solution

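  • One dimension is time, the other is the coefficient index. librosa splits the signal into short overlapping frames and computes one set of MFCCs per frame, so the result has shape (n_mfcc, t): each row is one coefficient tracked over time, and each column holds the coefficients for one frame. n_mfcc defaults to 20. This link shows how it looks if you plot it: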

    http://musicinformationretrieval.com/mfcc.html
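
To make the dimensions concrete, here is a minimal sketch that predicts the shape you should get back. The helper name expected_mfcc_shape is made up for illustration and is not part of librosa; it assumes a mono signal and librosa's defaults (n_mfcc=20, hop_length=512, centered frames).

```python
def expected_mfcc_shape(n_samples, n_mfcc=20, hop_length=512):
    """Predict the (rows, columns) shape of the MFCC array for a mono signal.

    Hypothetical helper, not a librosa function. Assumes centered framing
    (librosa's default), where the signal is padded so that one frame is
    produced every `hop_length` samples.
    """
    n_frames = 1 + n_samples // hop_length  # columns: one per time frame
    return (n_mfcc, n_frames)               # rows: one per coefficient


# One second of mono audio at 22050 Hz
print(expected_mfcc_shape(22050))  # (20, 44)
```

If you have librosa installed, comparing this against librosa.feature.mfcc(y=y, sr=sr).shape for your own file is a quick sanity check. If you really want a single 1D vector per file, one common approach is to average over the time axis (for example mfccs.mean(axis=1)), though that discards how the coefficients change over time.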