javascript · node.js · audio · frontend · frequency

How do I get the audio frequency from my mic using JavaScript?


I need to create something like a guitar tuner that recognizes the sound frequencies and determines which chord I am actually playing. It's similar to this guitar tuner that I found online: https://musicjungle.com.br/afinador-online. I can't figure out how it works, though, because of the webpack files. I want to make this tool backendless. Does anyone have a clue about how to do this purely in the front end?

I found some old pieces of code that don't work together. I need fresh ideas.


Solution

  • There are quite a few problems to unpack here, some of which will require a bit more information as to the application. Hopefully the sheer size of this task will become apparent as this answer progresses.

    As it stands, there are two problems here:

need to create something like a guitar tuner...

    1. How do you detect the fundamental pitch of a guitar note and feed that information back to the user in the browser?

    and

that recognizes the sound frequencies and determines which chord I am actually playing.

    2. How do you detect which chord a guitar is playing?

This second question is definitely not a trivial one, but we'll come to it in turn. It is not so much a programming question as a DSP (digital signal processing) question.

    Question 1: Pitch Detection in Browser

    Breakdown

If you wish to detect the pitch of a note in the browser there are a couple of sub-problems that should be split up. Shooting from the hip, we have the following JavaScript browser problems:

• requesting permission to use the microphone
• routing the microphone stream into the Web Audio API
• buffering time-domain samples from the stream
• estimating the fundamental frequency from those samples
• feeding the result back to the user

This is not an exhaustive list, but it should constitute the bulk of the overall problem.

There is no Minimal, Reproducible Example, so none of the above can be assumed.

    Implementation

A basic implementation would consist of a numeric representation of a single fundamental frequency (f0) using the autocorrelation method outlined in the A. v. Knesebeck and U. Zölzer paper [1].

There are other approaches which mix and match filtering and pitch detection algorithms, but those are far outside the scope of a reasonable answer.

NOTE: The Web Audio API is still not equally implemented across all browsers. You should check each of the major browsers and make accommodations in your program. The following was tested in Google Chrome, so your mileage may (and likely will) vary in other browsers.
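
Before anything else, it is worth feature-detecting the relevant APIs up front. A minimal sketch, assuming a console warning is an acceptable fallback (how you surface the failure to the user is up to you):

const AudioContextClass = window.AudioContext || window.webkitAudioContext;

// Warn early if the required APIs are missing in this browser
if (!AudioContextClass || !navigator.mediaDevices || !navigator.mediaDevices.getUserMedia) {
    console.warn('Web Audio API or microphone capture is not supported here.');
}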

    HTML

Our page should include:

• an element to display the detected frequency
• a button to start the pitch detection

A more rounded interface would likely split the operations of:

• requesting microphone permission
• starting the detection
• stopping the detection

into separate interface elements, but for brevity they will be wrapped into a single element. This gives us a basic HTML page of

    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>Pitch Detection</title>
    </head>
    <body>
    <h1>Frequency (Hz)</h1>
    <h2 id="frequency">0.0</h2>
    <div>
        <button onclick="startPitchDetection()">
            Start Pitch Detection
        </button>
    </div>
    </body>
    </html>
    

We are jumping the gun slightly with <button onclick="startPitchDetection()">: we will wrap up the whole operation in a single function called startPitchDetection.

Palette of variables

For an autocorrelation pitch detection approach our palette of variables will need to include:

• an audio context
• a media stream source for the microphone
• an analyser node
• arrays to hold the time-domain audio data and the correlated signal
• an array to hold the positions of local maxima
• a reference to the element that displays the frequency

giving us something like

let audioCtx = new (window.AudioContext || window.webkitAudioContext)();
let microphoneStream = null;
let analyserNode = audioCtx.createAnalyser();
let audioData = new Float32Array(analyserNode.fftSize);
let correlatedSignal = new Float32Array(analyserNode.fftSize);
    let localMaxima = new Array(10);
    const frequencyDisplayElement = document.querySelector('#frequency');
    

Some values are left null because they will not be known until the microphone stream has been activated. The 10 in let localMaxima = new Array(10); is a little arbitrary. This array will store the distance in samples between consecutive maxima of the correlated signal.

    Main script

Our <button> element has an onclick function of startPitchDetection, so that will be required. We will also need a function that performs the actual pitch estimation, which we will call getAutocorrelatedPitch.

However, the first thing we have to do is ask for permission to use the microphone. To achieve this we use navigator.mediaDevices.getUserMedia, which will return a Promise (note that this API is only available in secure contexts, i.e. HTTPS or localhost). Embellishing on what is outlined in the MDN documentation, this gives us something roughly looking like

    navigator.mediaDevices.getUserMedia({audio: true})
    .then((stream) => {
      /* use the stream */
    })
    .catch((err) => {
      /* handle the error */
    });
    

    Great! Now we can start adding our main functionality to the then function.

Our order of events should be:

• create a media stream source from the microphone stream
• connect it to the analyser node
• allocate the audio data and correlated signal arrays
• set an interval that reads the time-domain data, estimates the pitch and updates the display

On top of that, add a log of the error from the catch method.

    This can then all be wrapped into the startPitchDetection function, giving something like:

function startPitchDetection()
{
    navigator.mediaDevices.getUserMedia({audio: true})
        .then((stream) =>
        {
            microphoneStream = audioCtx.createMediaStreamSource(stream);
            microphoneStream.connect(analyserNode);

            audioData = new Float32Array(analyserNode.fftSize);
            correlatedSignal = new Float32Array(analyserNode.fftSize);

            setInterval(() => {
                analyserNode.getFloatTimeDomainData(audioData);

                let pitch = getAutocorrelatedPitch();

                frequencyDisplayElement.innerHTML = `${pitch}`;
            }, 300);
        })
        .catch((err) =>
        {
            console.log(err);
        });
}
    

The update interval for setInterval of 300 ms is arbitrary. A little experimentation will dictate which interval is best for you. You may even wish to give the user control of this, but that is outside the scope of this question.

The next step is to actually define what getAutocorrelatedPitch() does, so let's break down what autocorrelation is.

Autocorrelation is the process of correlating a signal with a delayed copy of itself. Any point where the result goes from a positive rate of change to a negative rate of change is defined as a local maximum. The number of samples between the start of the correlated signal and the first maximum should be the period in samples of f0. We can continue to look for subsequent maxima and take an average, which should improve accuracy slightly. Some frequencies do not have a period of a whole number of samples; for instance, 440 Hz at a sample rate of 44100 Hz has a period of 100.227 samples. Taking a single maximum we could technically never detect this frequency of 440 Hz accurately: the result would always be either 441 Hz (44100/100) or roughly 436.6 Hz (44100/101).
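
To see that quantisation in numbers, here is a small illustrative sketch (the helper name periodToFrequency is ours, not part of any API):

const sampleRate = 44100;

// Convert a period measured in whole samples to a frequency in Hz
function periodToFrequency(periodInSamples) {
    return sampleRate / periodInSamples;
}

console.log(periodToFrequency(100)); // 441 Hz
console.log(periodToFrequency(101)); // ~436.6 Hz
// 440 Hz (period 100.227 samples) lies between the two,
// which is why averaging over several maxima helps.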

For our autocorrelation function, we'll need:

• the latest time-domain samples from the analyser node
• an array to hold the correlated signal
• an array to record the positions of the local maxima

Our function should first perform the autocorrelation, find the sample positions of the local maxima and then calculate the mean distance between these maxima. This gives a function looking like:

function getAutocorrelatedPitch()
{
    // First: autocorrelate the signal

    let maximaCount = 0;

    for (let l = 0; l < analyserNode.fftSize; l++) {
        correlatedSignal[l] = 0;
        for (let i = 0; i < analyserNode.fftSize - l; i++) {
            correlatedSignal[l] += audioData[i] * audioData[i + l];
        }
        if (l > 1) {
            if ((correlatedSignal[l - 2] - correlatedSignal[l - 1]) < 0
                && (correlatedSignal[l - 1] - correlatedSignal[l]) > 0) {
                localMaxima[maximaCount] = (l - 1);
                maximaCount++;
                if ((maximaCount >= localMaxima.length))
                    break;
            }
        }
    }

    // Guard against silence: no maxima means no detectable pitch
    if (maximaCount === 0)
        return 0;

    // Second: find the average distance in samples between maxima

    let maximaMean = localMaxima[0];

    for (let i = 1; i < maximaCount; i++)
        maximaMean += localMaxima[i] - localMaxima[i - 1];

    maximaMean /= maximaCount;

    return audioCtx.sampleRate / maximaMean;
}
    
    Problems

Once you have implemented this you may find there are actually a couple of problems: the detected frequency is erratic, and a bare number is a crude way to present it.

The erratic result is down to the fact that autocorrelation by itself is not a perfect solution. You will need to experiment with filtering the signal first and aggregating other methods (the YIN estimator [6] is a popular refinement of the autocorrelation approach). You could also try gating the signal so it is only analysed when above a certain threshold, or increase the rate at which you perform the detection and average out the results.
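
As one sketch of the thresholding and averaging ideas (the names, window length and RMS threshold here are illustrative assumptions, not tuned values):

const pitchHistory = [];

function getSmoothedPitch() {
    // Simple RMS gate: skip analysis when the signal is too quiet
    let rms = 0;
    for (let i = 0; i < audioData.length; i++)
        rms += audioData[i] * audioData[i];
    rms = Math.sqrt(rms / audioData.length);
    if (rms < 0.01)
        return 0;

    // Keep a short history of estimates and report the median,
    // which discards occasional wild outliers
    pitchHistory.push(getAutocorrelatedPitch());
    if (pitchHistory.length > 5)
        pitchHistory.shift();

    const sorted = [...pitchHistory].sort((a, b) => a - b);
    return sorted[Math.floor(sorted.length / 2)];
}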

Secondly, the method of display is limited. Musicians would not be appreciative of a simple numerical result; rather, some kind of graphical feedback would be more intuitive. Again, that is outside the scope of the question.
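
That said, a common building block for friendlier feedback is converting the detected frequency to the nearest note name plus an offset in cents. The arithmetic below is standard equal temperament (A4 = 440 Hz, MIDI note 69); the function name is illustrative:

const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Map a frequency in Hz to the nearest note name and cents offset
function frequencyToNote(frequency) {
    const semitonesFromA4 = 12 * Math.log2(frequency / 440);
    const nearest = Math.round(semitonesFromA4);
    const cents = Math.round((semitonesFromA4 - nearest) * 100);

    const midi = 69 + nearest;                  // A4 is MIDI note 69
    const name = NOTE_NAMES[midi % 12];
    const octave = Math.floor(midi / 12) - 1;

    return `${name}${octave} (${cents >= 0 ? '+' : ''}${cents} cents)`;
}

console.log(frequencyToNote(82.4)); // roughly E2, the low E string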

    Full page and script
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>Pitch Detection</title>
    </head>
    <body>
    <h1>Frequency (Hz)</h1>
    <h2 id="frequency">0.0</h2>
    <div>
        <button onclick="startPitchDetection()">
            Start Pitch Detection
        </button>
    </div>
    <script>
let audioCtx = new (window.AudioContext || window.webkitAudioContext)();
let microphoneStream = null;
let analyserNode = audioCtx.createAnalyser();
let audioData = new Float32Array(analyserNode.fftSize);
let correlatedSignal = new Float32Array(analyserNode.fftSize);
        let localMaxima = new Array(10);
        const frequencyDisplayElement = document.querySelector('#frequency');
    
function startPitchDetection()
{
    navigator.mediaDevices.getUserMedia({audio: true})
        .then((stream) =>
        {
            microphoneStream = audioCtx.createMediaStreamSource(stream);
            microphoneStream.connect(analyserNode);

            audioData = new Float32Array(analyserNode.fftSize);
            correlatedSignal = new Float32Array(analyserNode.fftSize);

            setInterval(() => {
                analyserNode.getFloatTimeDomainData(audioData);

                let pitch = getAutocorrelatedPitch();

                frequencyDisplayElement.innerHTML = `${pitch}`;
            }, 300);
        })
        .catch((err) =>
        {
            console.log(err);
        });
}
    
function getAutocorrelatedPitch()
{
    // First: autocorrelate the signal

    let maximaCount = 0;

    for (let l = 0; l < analyserNode.fftSize; l++) {
        correlatedSignal[l] = 0;
        for (let i = 0; i < analyserNode.fftSize - l; i++) {
            correlatedSignal[l] += audioData[i] * audioData[i + l];
        }
        if (l > 1) {
            if ((correlatedSignal[l - 2] - correlatedSignal[l - 1]) < 0
                && (correlatedSignal[l - 1] - correlatedSignal[l]) > 0) {
                localMaxima[maximaCount] = (l - 1);
                maximaCount++;
                if ((maximaCount >= localMaxima.length))
                    break;
            }
        }
    }

    // Guard against silence: no maxima means no detectable pitch
    if (maximaCount === 0)
        return 0;

    // Second: find the average distance in samples between maxima

    let maximaMean = localMaxima[0];

    for (let i = 1; i < maximaCount; i++)
        maximaMean += localMaxima[i] - localMaxima[i - 1];

    maximaMean /= maximaCount;

    return audioCtx.sampleRate / maximaMean;
}
    </script>
    </body>
    </html>
    

    Question 2: Detecting multiple notes

At this point I think we can all agree that this answer has gotten a little out of hand. So far we've only covered a single method of pitch detection. See Refs [2, 3, 4] for some suggestions of algorithms for multiple-f0 detection.

In essence, this problem comes down to detecting all the f0s present and looking up the resulting notes against a dictionary of chords; Ref [5] covers real-time chord recognition for live performance. For that, there should at least be a little work done on your part. Any questions about the DSP should probably be pointed toward https://dsp.stackexchange.com, where you will be spoiled for choice on questions regarding pitch detection algorithms.
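
As a minimal sketch of the dictionary idea, assuming you already have the detected notes as pitch-class names (the tiny chord table and the matching rule are illustrative only):

const CHORDS = {
    'E minor': ['E', 'G', 'B'],
    'E major': ['E', 'G#', 'B'],
    'A minor': ['A', 'C', 'E'],
    'G major': ['G', 'B', 'D']
};

// Return the first chord whose every tone appears among the detected notes
function identifyChord(detectedNotes) {
    const unique = new Set(detectedNotes);
    for (const [name, notes] of Object.entries(CHORDS)) {
        if (notes.every((n) => unique.has(n)))
            return name;
    }
    return 'unknown';
}

console.log(identifyChord(['E', 'B', 'E', 'G', 'B', 'E'])); // 'E minor'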

    References

    1. A. v. Knesebeck and U. Zölzer, "Comparison of pitch trackers for real-time guitar effects", in Proceedings of the 13th International Conference on Digital Audio Effects (DAFx-10), Graz, Austria, September 6-10, 2010.
    2. A. P. Klapuri, "A perceptually motivated multiple-F0 estimation method," IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005., 2005, pp. 291-294, doi: 10.1109/ASPAA.2005.1540227.
    3. A. P. Klapuri, "Multiple fundamental frequency estimation based on harmonicity and spectral smoothness," in IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, pp. 804-816, Nov. 2003, doi: 10.1109/TSA.2003.815516.
    4. A. P. Klapuri, "Multipitch estimation and sound separation by the spectral smoothness principle," 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), 2001, pp. 3381-3384 vol.5, doi: 10.1109/ICASSP.2001.940384.
    5. A. M. Stark and M. D. Plumbley, "Real-Time Chord Recognition For Live Performance", In Proceedings of the 2009 International Computer Music Conference (ICMC 2009), Montreal, Canada, 16-21 August 2009.
6. A. de Cheveigné and H. Kawahara, "YIN, a fundamental frequency estimator for speech and music," J. Acoust. Soc. Am., vol. 111, no. 4, pp. 1917-1930, April 2002.