For the past few days I've been working on a part of my app that needs to play an audio file and record at the same time. The goal is simply to compare the recording with the audio file that was played and return a matching percentage. Here's what I have done so far, plus some context for my questions:
The target API level is >15
I decided to use a .wav audio file format to simplify decoding the file
And below are a few questions that I have:
Am I going about this the right way or am I missing something?
In apps like Shazam and Midomi, audio matching is done with a technique called audio fingerprinting, which uses spectrograms and hashing.
It is a fairly involved process; you can find a more detailed explanation at this link: http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
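To give a rough idea of the spectrogram-plus-hashing approach, here is a minimal sketch in plain Java. It is heavily simplified compared to the paper: it keeps only the single strongest frequency bin per frame (the paper uses a whole constellation of peaks), it uses a slow naive DFT instead of an FFT, and it does not check that matching hashes line up in time. The class name, the method names and the constants FRAME and FAN_OUT are all made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

final class FingerprintSketch {
    static final int FRAME = 256;   // samples per analysis frame (assumed value)
    static final int FAN_OUT = 3;   // later frames each peak is paired with (assumed value)

    /** Strongest frequency bin of each frame, found with a naive windowed DFT. */
    static int[] peakBins(double[] samples) {
        int frames = samples.length / FRAME;
        int[] peaks = new int[frames];
        double[] windowed = new double[FRAME];
        for (int f = 0; f < frames; f++) {
            // Apply a Hann window to reduce spectral leakage.
            for (int n = 0; n < FRAME; n++) {
                double w = 0.5 - 0.5 * Math.cos(2 * Math.PI * n / (FRAME - 1));
                windowed[n] = samples[f * FRAME + n] * w;
            }
            double best = -1;
            for (int k = 1; k < FRAME / 2; k++) {   // skip the DC bin
                double re = 0, im = 0;
                for (int n = 0; n < FRAME; n++) {
                    double angle = 2 * Math.PI * k * n / FRAME;
                    re += windowed[n] * Math.cos(angle);
                    im -= windowed[n] * Math.sin(angle);
                }
                double power = re * re + im * im;
                if (power > best) {
                    best = power;
                    peaks[f] = k;
                }
            }
        }
        return peaks;
    }

    /** Hashes pairs of peaks as (bin1, bin2, frame gap) packed into one long. */
    static Set<Long> fingerprint(double[] samples) {
        int[] peaks = peakBins(samples);
        Set<Long> hashes = new HashSet<>();
        for (int i = 0; i < peaks.length; i++) {
            for (int d = 1; d <= FAN_OUT && i + d < peaks.length; d++) {
                hashes.add(((long) peaks[i] << 20) | ((long) peaks[i + d] << 8) | d);
            }
        }
        return hashes;
    }

    /** Percentage of the reference hashes that also occur in the recording. */
    static double matchPercent(double[] reference, double[] recording) {
        Set<Long> ref = fingerprint(reference);
        if (ref.isEmpty()) {
            return 0.0;
        }
        Set<Long> rec = fingerprint(recording);
        int hits = 0;
        for (long h : ref) {
            if (rec.contains(h)) {
                hits++;
            }
        }
        return 100.0 * hits / ref.size();
    }
}
```

You would call `FingerprintSketch.matchPercent(referenceSamples, recordedSamples)` with both signals decoded to mono samples at the same sample rate. Because the hashes are built from peak positions rather than raw amplitudes, a quieter recording can still match, but a real recording with noise will need the fuller peak picking and time-alignment step described in the paper.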
Some libraries can do it for you: dejavu (https://github.com/worldveil/dejavu), which is written in Python, and Chromaprint, which is in C++. musicg (hosted on Google Code) is in Java, but it doesn't perform well with background noise.
Matching two audio files is a complicated process and, like the comments above, I would also suggest getting it working on a PC first and then moving to phones.
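For the PC experiments, you first need the .wav data as plain samples. Here is a rough sketch using javax.sound.sampled, which is available in desktop Java but not on Android. It assumes 16-bit little-endian mono PCM files, and the class and method names are just placeholders.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayOutputStream;
import java.io.File;

final class WavSamples {
    /** Converts 16-bit little-endian PCM bytes to samples in the range [-1, 1). */
    static double[] pcm16ToDoubles(byte[] pcm) {
        double[] out = new double[pcm.length / 2];
        for (int i = 0; i < out.length; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = pcm[2 * i + 1];      // high byte, kept signed so the sign survives
            out[i] = ((hi << 8) | lo) / 32768.0;
        }
        return out;
    }

    /** Reads a whole WAV file into memory (assumes 16-bit little-endian mono PCM). */
    static double[] read(File wav) throws Exception {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(wav)) {
            AudioFormat fmt = in.getFormat();
            boolean supported = AudioFormat.Encoding.PCM_SIGNED.equals(fmt.getEncoding())
                    && fmt.getSampleSizeInBits() == 16
                    && !fmt.isBigEndian()
                    && fmt.getChannels() == 1;
            if (!supported) {
                throw new IllegalArgumentException("expected 16-bit little-endian mono PCM");
            }
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) > 0) {
                buf.write(chunk, 0, n);
            }
            return pcm16ToDoubles(buf.toByteArray());
        }
    }
}
```

On the phone, the byte-to-sample conversion in `pcm16ToDoubles` should work unchanged on raw 16-bit PCM buffers, so only the file-reading part would need replacing.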