java audio javasound tarsosdsp

TarsosDSP Pitch Analysis for Dummies


I am working on a program that analyzes the pitch of a sound file. I came across a good API called "TarsosDSP" which offers several pitch analysis algorithms. However, I am having a lot of trouble setting it up. Can someone give me some quick pointers on how to use this API (especially the PitchProcessor class)? Code snippets would be appreciated because I am new to sound analysis.

EDIT: I found a document at http://husk.eecs.berkeley.edu/courses/cs160-sp14/index.php/Sound_Programming with some example code that shows how to set up the PitchProcessor, …

int bufferReadResult = mRecorder.read(mBuffer, 0, mBufferSize);
// (note: this is NOT android.media.AudioFormat)
be.hogent.tarsos.dsp.AudioFormat mTarsosFormat = new be.hogent.tarsos.dsp.AudioFormat(SAMPLE_RATE, 16, 1, true, false);
AudioEvent audioEvent = new AudioEvent(mTarsosFormat, bufferReadResult);
audioEvent.setFloatBufferWithByteBuffer(mBuffer);
pitchProcessor.process(audioEvent);

…I am lost. What exactly are mBuffer and mBufferSize? How do I find these values? And where do I input my audio files?


Solution

  • The basic flow of audio in the TarsosDSP framework is as follows: the incoming audio stream, originating from an audio file or a microphone, is read and chopped into frames of e.g. 1024 samples. Each frame travels through a pipeline whose stages modify or analyse it (pitch analysis is one such stage).

    In TarsosDSP the AudioDispatcher is responsible for chopping the audio into frames. It also wraps each audio frame in an AudioEvent object. This AudioEvent object is sent through a chain of AudioProcessors.

    So in the code you quoted, mBuffer is the audio frame and mBufferSize is the size of that buffer in samples. You can choose the buffer size yourself, but for pitch detection 2048 samples is reasonable.
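To make the framing step concrete, here is a minimal sketch in plain Java (no TarsosDSP involved) of chopping a mono signal into fixed-size frames, each tagged with its start time. The class and method names (FrameChopper, chop, Frame) are hypothetical, and dropping a trailing partial frame is a simplification chosen for this sketch; it is not a description of how AudioDispatcher handles the end of a stream.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameChopper {

    /** One frame of audio plus the time (in seconds) at which it starts. */
    public static final class Frame {
        public final float[] samples;
        public final double timeStamp;

        Frame(float[] samples, double timeStamp) {
            this.samples = samples;
            this.timeStamp = timeStamp;
        }
    }

    /**
     * Splits a mono signal into frames of bufferSize samples. Consecutive
     * frames share `overlap` samples. A trailing partial frame is dropped
     * (an assumption made to keep the sketch short).
     */
    public static List<Frame> chop(float[] signal, int bufferSize, int overlap, float sampleRate) {
        if (bufferSize <= 0 || overlap < 0 || overlap >= bufferSize) {
            throw new IllegalArgumentException("need 0 <= overlap < bufferSize");
        }
        int step = bufferSize - overlap; // how far the window advances per frame
        List<Frame> frames = new ArrayList<>();
        for (int start = 0; start + bufferSize <= signal.length; start += step) {
            float[] samples = new float[bufferSize];
            System.arraycopy(signal, start, samples, 0, bufferSize);
            frames.add(new Frame(samples, start / (double) sampleRate));
        }
        return frames;
    }

    public static void main(String[] args) {
        float sampleRate = 44100f;
        float[] oneSecond = new float[44100]; // silence, just to count frames
        List<Frame> frames = chop(oneSecond, 2048, 0, sampleRate);
        System.out.println(frames.size() + " frames; second frame starts at "
                + frames.get(1).timeStamp + " s");
    }
}
```

A larger buffer gives a pitch detector more periods of a low note to work with, at the cost of coarser time resolution; a non-zero overlap produces more frames per second without shrinking the buffer.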

    For pitch detection you could do something like this with the TarsosDSP library:

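        // Imports as packaged in TarsosDSP 2.x (older releases, like the one in the question, use be.hogent.tarsos.dsp)
        import be.tarsos.dsp.AudioDispatcher;
        import be.tarsos.dsp.AudioEvent;
        import be.tarsos.dsp.io.jvm.AudioDispatcherFactory;
        import be.tarsos.dsp.pitch.PitchDetectionHandler;
        import be.tarsos.dsp.pitch.PitchDetectionResult;
        import be.tarsos.dsp.pitch.PitchProcessor;
        import be.tarsos.dsp.pitch.PitchProcessor.PitchEstimationAlgorithm;

        PitchDetectionHandler handler = new PitchDetectionHandler() {
            @Override
            public void handlePitch(PitchDetectionResult pitchDetectionResult,
                    AudioEvent audioEvent) {
                // Print the frame's time stamp (seconds) and the detected pitch (Hz)
                System.out.println(audioEvent.getTimeStamp() + " " + pitchDetectionResult.getPitch());
            }
        };
        AudioDispatcher adp = AudioDispatcherFactory.fromDefaultMicrophone(2048, 0); // buffer size 2048, overlap 0
        adp.addAudioProcessor(new PitchProcessor(PitchEstimationAlgorithm.YIN, 44100, 2048, handler));
        adp.run();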
    

    In this code, a handler is created first; it simply prints the time stamp and the detected pitch of each frame. The AudioDispatcher is attached to the default microphone with a buffer size of 2048 samples and no overlap. A PitchProcessor, the audio processor that detects pitch, is then added to the AudioDispatcher and is given the handler so it can report its results.

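    The last line starts the process. Note that run() executes in the calling thread and blocks it while audio is being processed; to keep your program responsive you can start the dispatcher in its own thread instead, e.g. new Thread(adp).start().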
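If it helps to see what "detecting pitch in one frame" can mean, below is a minimal, self-contained sketch in plain Java. It is not TarsosDSP code and it is not YIN: it uses a naive autocorrelation peak search, a simpler technique swapped in purely for illustration. The class and method names, the 80 to 1000 Hz search range, the 0.5 threshold, and the use of -1 for "no pitch found" are all assumptions made for this sketch.

```java
public class NaivePitchSketch {

    /**
     * Estimates the fundamental frequency of one frame by picking the lag
     * with the largest (unnormalized) autocorrelation inside the search
     * range. Returns -1 when the frame is silent or no lag is convincing.
     */
    public static float estimatePitch(float[] frame, float sampleRate, float minHz, float maxHz) {
        double energy = 0;
        for (float s : frame) {
            energy += s * s;
        }
        if (energy == 0) {
            return -1f; // silence: nothing to detect
        }
        int minLag = Math.max(1, (int) (sampleRate / maxHz));
        int maxLag = Math.min((int) (sampleRate / minHz), frame.length / 2);
        int bestLag = -1;
        double best = 0.5 * energy; // a lag must reach half the frame energy to count
        for (int lag = minLag; lag <= maxLag; lag++) {
            double sum = 0;
            for (int i = 0; i + lag < frame.length; i++) {
                sum += frame[i] * frame[i + lag];
            }
            if (sum > best) {
                best = sum;
                bestLag = lag;
            }
        }
        return bestLag < 0 ? -1f : sampleRate / bestLag;
    }

    /** Generates a test tone so the sketch can run without any audio file. */
    public static float[] sine(float hz, float sampleRate, int length) {
        float[] out = new float[length];
        for (int i = 0; i < length; i++) {
            out[i] = (float) Math.sin(2 * Math.PI * hz * i / sampleRate);
        }
        return out;
    }

    public static void main(String[] args) {
        float sampleRate = 44100f;
        // 441 Hz is used because one period is then exactly 100 samples
        float[] frame = sine(441f, sampleRate, 2048);
        System.out.println(estimatePitch(frame, sampleRate, 80f, 1000f) + " Hz");
    }
}
```

Because the lag is a whole number of samples, the resolution of this sketch is coarse at high pitches, and it is easily fooled by noisy or polyphonic input. Those weaknesses are a large part of why a library implementation such as the YIN option selected above is the better choice for real recordings.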