javascript web-audio-api

What are the frequencies in Hertz of the "bins" that AnalyserNode.getByteFrequencyData() returns?


I am currently working on a project that uses JavaScript's Web Audio API to show a graph of the frequencies in the audio. AnalyserNode.getByteFrequencyData() fills a Uint8Array whose elements are "bins", each covering a frequency range. I would like to know which frequencies (measured in Hertz) those "bins" correspond to.

I've tried interpreting each bin as a 1 Hz step, but that clearly isn't right.


Solution

  • The frequency of the kth bin is k/N*sr, where sr is the sampling rate (the AudioContext's sampleRate) and N is the number of points in the FFT (the AnalyserNode's fftSize).

    I think the MDN doc is incorrect, unless they're doing something pretty non-standard with their FFT. If you have a sampling rate of 48000 Hz and do a 128-point FFT, getByteFrequencyData() fills fftSize/2 = 64 values, so the last value has index 63 and frequency 63/128*48000 = 23625 Hz.
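
The formula above can be sketched in a few lines. This is only an illustration: `binFrequency` is a hypothetical helper name, and it assumes N is the analyser's `fftSize` and sr is the context's `sampleRate`.

```javascript
// Hypothetical helper: frequency in Hz of bin k for an N-point FFT.
// fftSize plays the role of N, sampleRate the role of sr.
function binFrequency(k, fftSize, sampleRate) {
  return (k / fftSize) * sampleRate;
}

// The example from the text: 48000 Hz sampling rate, 128-point FFT, last index 63.
console.log(binFrequency(63, 128, 48000)); // 23625
// Spacing between neighbouring bins is sr/N.
console.log(binFrequency(1, 128, 48000)); // 375
```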

    In general, an N-point FFT [1] takes N time-domain samples and gives you N frequency-domain samples. If your sampling frequency is sr, then the bin spacing is sr/N per bin [2]. So if you have a sampling rate of 8 kHz and do an 8-point FFT, the bin frequencies are

    [0, 1k, 2k, 3k, 4k, 5k, 6k, 7k]
    

    You can also think of frequencies above the Nyquist frequency (sr/2) as negative frequencies [3], so you could renumber that list as:

    [0, 1k, 2k, 3k, 4k, -3k, -2k, -1k]
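
Both lists can be reproduced with a short sketch. The function names here are hypothetical, and the signed version follows the convention used in this answer, where the bin at exactly sr/2 stays positive.

```javascript
// Frequency of every bin of an n-point FFT: k * sr / n for k = 0..n-1.
function fftBinFrequencies(n, sampleRate) {
  return Array.from({ length: n }, (_, k) => (k * sampleRate) / n);
}

// Same bins, with everything above sr/2 relabelled as a negative frequency.
function signedBinFrequencies(n, sampleRate) {
  return fftBinFrequencies(n, sampleRate).map((f) =>
    f > sampleRate / 2 ? f - sampleRate : f
  );
}

console.log(fftBinFrequencies(8, 8000));
// [0, 1000, 2000, 3000, 4000, 5000, 6000, 7000]
console.log(signedBinFrequencies(8, 8000));
// [0, 1000, 2000, 3000, 4000, -3000, -2000, -1000]
```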
    

    If the time-domain signal is real-valued (which is the case for audio), then the spectrum is symmetric [4], so there's no new information in the negative frequencies. DSP engineers working with audio signals frequently use only the non-negative frequencies, and most FFT libraries have an rfft function for this purpose. For an N-point FFT, the rfft function typically gives you a result of length N/2+1, which avoids computing the negative frequencies.
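
To see the symmetry and the N/2+1 length concretely, here is a small sketch using a naive O(N^2) DFT rather than a real FFT library. The `dft` and `rfft` functions are hypothetical stand-ins written for this illustration, not any library's API, and the input values are arbitrary.

```javascript
// Naive DFT of a real-valued signal; returns real and imaginary parts per bin.
function dft(x) {
  const n = x.length;
  const re = new Array(n).fill(0);
  const im = new Array(n).fill(0);
  for (let k = 0; k < n; k++) {
    for (let t = 0; t < n; t++) {
      const angle = (-2 * Math.PI * k * t) / n;
      re[k] += x[t] * Math.cos(angle);
      im[k] += x[t] * Math.sin(angle);
    }
  }
  return { re, im };
}

// Stand-in for an "rfft": keep only bins 0..N/2, i.e. N/2 + 1 values for even N.
function rfft(x) {
  const { re, im } = dft(x);
  const half = Math.floor(x.length / 2) + 1;
  return { re: re.slice(0, half), im: im.slice(0, half) };
}

const signal = [1, 2, 0, -1, 3, 0.5, -2, 1]; // arbitrary real samples, N = 8
const spectrum = dft(signal);

// Conjugate symmetry: X[N-k] matches conj(X[k]) up to floating-point rounding.
let symmetric = true;
for (let k = 1; k < signal.length; k++) {
  const j = signal.length - k;
  if (Math.abs(spectrum.re[j] - spectrum.re[k]) > 1e-9) symmetric = false;
  if (Math.abs(spectrum.im[j] + spectrum.im[k]) > 1e-9) symmetric = false;
}
console.log(symmetric); // true
console.log(rfft(signal).re.length); // 5, i.e. N/2 + 1
```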

    For some reason, the Web Audio designers dropped the last frequency bin, which has frequency sr/2. In practice, there's not usually much of interest happening up at the Nyquist frequency, and the analyser node only gives you the magnitudes of the spectrum, so it's intended more for visualization than for fancy frequency-domain processing anyway.
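
Putting this together for the analyser node, a sketch of labelling each returned value with its frequency might look like the following. `analyserBinFrequencies` is a hypothetical helper; it only reads `fftSize` and `frequencyBinCount` from the analyser, and the sample rate is assumed to come from the AudioContext that created it.

```javascript
// Hypothetical helper: frequency in Hz for each value that
// getByteFrequencyData() writes. frequencyBinCount is fftSize / 2,
// so the bin at sr/2 is not included.
function analyserBinFrequencies(analyser, sampleRate) {
  const count = analyser.frequencyBinCount;
  const frequencies = new Float64Array(count);
  for (let k = 0; k < count; k++) {
    frequencies[k] = (k * sampleRate) / analyser.fftSize;
  }
  return frequencies;
}

// Browser usage (commented out because it needs a page with Web Audio):
// const ctx = new AudioContext();
// const analyser = ctx.createAnalyser();
// analyser.fftSize = 2048;
// const data = new Uint8Array(analyser.frequencyBinCount);
// analyser.getByteFrequencyData(data);
// const hz = analyserBinFrequencies(analyser, ctx.sampleRate);
// data[k] is then the magnitude for the bin at hz[k].
```

With a 128-point FFT at 48000 Hz this yields 64 labels, from 0 Hz up to 23625 Hz, one bin short of the Nyquist frequency at 24000 Hz.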


    [1] Generally called a DFT (discrete Fourier transform) in math. The FFT (fast Fourier transform) is a particular algorithm for computing the DFT.

    [2] If you look at the definition of the DFT, for the kth spectral bin you use the sinusoid w(n) = exp(-i*2*pi*k/N*n), which has frequency k/N in cycles per sample. The sampling rate is 1 there because we're measuring time in samples; multiplying by sr converts k/N to Hertz.

    [3] Conceptually, the DFT result is periodic with period N, so it's always true that X[k] = X[k+N]. Remember that k is the index here, not the bin frequency, and we're allowing negative indices to wrap around from the end, NumPy-style.

    [4] Specifically, conjugate symmetric, so X[-k] = conj(X[k]).