java wav mfcc dtw tarsosdsp

Difficulties using TarsosDSP to extract MFCCs from WAV files in Java


I am attempting to use the TarsosDSP library to extract the MFCC values from WAV files, before using DTW to calculate the distance between them.

Unfortunately, I am having trouble understanding how the code from the MFCC class can be used on a WAV file.

I am unsure whether I need to convert the WAV file into some sort of array buffer first.

Please see the code from the library for the MFCC class at this link.

https://github.com/JorenSix/TarsosDSP/blob/master/src/core/be/tarsos/dsp/mfcc/MFCC.java

If I could get advice about how to properly use this code to get MFCC values from a WAV file, or perhaps recommendations for another method, I would greatly appreciate it.


Solution

  • This sample code should do the job for small files. It loads the whole .wav file into a byte array, so it is not the right approach for big files. The final variables should probably be changed to suit your use case. I'm still new to Java, so there's no guarantee that this is the best approach.

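    import java.io.File;
    import java.util.Arrays; // only needed if the print statement below is uncommented

    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;

    import be.tarsos.dsp.AudioDispatcher;
    import be.tarsos.dsp.AudioEvent;
    import be.tarsos.dsp.AudioProcessor;
    import be.tarsos.dsp.io.jvm.AudioDispatcherFactory;
    import be.tarsos.dsp.mfcc.MFCC;

    public class App {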
    private final static String pathToFile = "D:\\TarsosWavTest\\wavs\\1000HzTone.wav";
    private final static int audioBufferSize = 2048;
    private final static int bufferOverlap = 1024;
    private final static int amountOfMelFilters = 20;
    private final static int amountOfCepstrumCoef = 30;
    private final static float lowerFilterFreq = 133.33f;
    private final static float upperFilterFreq = 8000f;
    
    public static void main(String[] args) {
        File file = new File(pathToFile);
        AudioInputStream audioInputStream;
        byte[] byteAudioArray;
        AudioDispatcher audioDispatcher;
    
        try {
            audioInputStream = AudioSystem.getAudioInputStream(file);
            byteAudioArray = audioInputStream.readAllBytes(); // requires Java 9 or later
        } catch (Exception e) {
            System.out.println("Exception occurred");
            e.printStackTrace();
            return;
        }
    
        try {
            audioDispatcher = AudioDispatcherFactory.fromByteArray(byteAudioArray, audioInputStream.getFormat(),
                    audioBufferSize, bufferOverlap);
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }
    
        final MFCC mfccProcessor = new MFCC(audioBufferSize, audioInputStream.getFormat().getSampleRate(),
                amountOfCepstrumCoef, amountOfMelFilters, lowerFilterFreq, upperFilterFreq);
    
        audioDispatcher.addAudioProcessor(mfccProcessor);
        audioDispatcher.addAudioProcessor(new AudioProcessor() {
    
            @Override // gets called on each audio frame
            public boolean process(AudioEvent audioEvent) {
                float[] mfccs = mfccProcessor.getMFCC();
                /*  do whatever necessary with the mfcc elements here
                    e.g print them  */
                //System.out.println(Arrays.toString(mfccs));
                return true;
            }
    
            @Override // gets called when end of the audio file was reached
            public void processingFinished() {
                System.out.println("end of file reached");
            }
        });
        audioDispatcher.run(); // runs in the current thread and blocks until the whole file has been processed
    }
    }
    

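    Please note that different libraries (e.g. librosa) are NOT guaranteed to compute the same MFCCs, even with the same input parameters, because implementations can differ in details such as filter bank construction, windowing and normalization.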
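
Since the question goes on to compare the MFCCs with DTW, here is a minimal sketch of that step, assuming you have collected the per-frame MFCC arrays of each file into a float[][] (one row per frame). It uses only the standard library and no TarsosDSP API. The names DtwSketch, frameDistance and dtwDistance are made up for illustration, and the Euclidean distance between frames is an assumption; other frame distances may suit your data better.

```java
import java.util.Arrays;

public class DtwSketch {

    // Euclidean distance between two MFCC frames of equal length
    // (assumed frame metric, not something prescribed by TarsosDSP).
    static double frameDistance(float[] a, float[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Basic O(n*m) dynamic time warping over two frame sequences,
    // allowing the steps (i-1, j), (i, j-1) and (i-1, j-1).
    static double dtwDistance(float[][] s, float[][] t) {
        int n = s.length;
        int m = t.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = frameDistance(s[i - 1], t[j - 1]) + best;
            }
        }
        return cost[n][m];
    }

    public static void main(String[] args) {
        // Toy "MFCC" sequences with two coefficients per frame.
        float[][] a = {{0f, 0f}, {1f, 1f}, {2f, 2f}};
        float[][] b = {{0f, 0f}, {0f, 0f}, {1f, 1f}, {2f, 2f}};
        System.out.println(dtwDistance(a, a));
        System.out.println(dtwDistance(a, b));
    }
}
```

In the answer's code you would add each mfccs array to a list inside process(), convert the list to a float[][] in processingFinished(), and then pass the sequences from two files to dtwDistance. This version keeps the full cost matrix in memory, so like the answer's code it is only meant for short recordings.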