android kotlin fft frequency visualizer

Getting variable frequency ranges with Android's Visualizer class


I want to get values for certain ranges of frequencies of the sound that is played by the smartphone so I can forward them via Bluetooth to a device that visualizes these ranges. Those ranges are:
0-63Hz
63-160Hz
160-400Hz
400-1000Hz
1000-2500Hz
2500-6250Hz
6250-16000Hz

The audio session ID is 0, so I can use any sound played by the smartphone.

What I found is the Visualizer class, and I thought I could achieve this with its getFft method. However, it seems I can only separate the frequencies into equally sized parts with the capture rate? Or am I completely misunderstanding something here? I tried using the sampling rate as the capture rate so I would have a value for each frequency, but it just set the capture rate back to 1024.
Or maybe this class just isn't what I want? I think I might be missing the point completely, so any help or explanation (or a recommendation of another library) would be welcome.

        val visualizer = Visualizer(0)
        visualizer.scalingMode = 0

        visualizer.setDataCaptureListener(object : Visualizer.OnDataCaptureListener {
            override fun onWaveFormDataCapture(
                vis: Visualizer,
                bytes: ByteArray,
                samplingRate: Int
            ) {

            }

            override fun onFftDataCapture(
                visualizer: Visualizer?,
                fft: ByteArray?,
                samplingRate: Int
            ) {
                //if frequency <=63 do something
                //else if frequency <=160 do something ...
            }

        }, Visualizer.getMaxCaptureRate() / 2, false, true)
        visualizer.enabled = true



Solution

  • It is inherent to the math of an FFT that it produces frequency "buckets" that are evenly sized. Their count is equal to half the sample size, and they go up to a frequency that is half the sample rate. (An FFT actually produces as many buckets as there are samples, but for real-valued audio the second half is a reflection of the first half, so Android's Visualizer drops it before delivering the results because it is not useful for visualization.)
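
    To make the bucket layout concrete, here is a minimal sketch in plain Kotlin with no Android dependency. The helper names binWidthHz and binFrequencyHz are made up for illustration, and the 44100 Hz sampling rate is only an assumed example value; the real rate has to come from the Visualizer.

    ```kotlin
    // Hypothetical helpers, not part of the Android API.

    /** Width of one FFT bucket in Hz: sampling rate divided by the number of samples. */
    fun binWidthHz(samplingRateHz: Float, captureSize: Int): Float =
        samplingRateHz / captureSize

    /** Center frequency of bucket k, for k in 0..captureSize / 2. */
    fun binFrequencyHz(k: Int, samplingRateHz: Float, captureSize: Int): Float =
        k * binWidthHz(samplingRateHz, captureSize)

    fun main() {
        val samplingRateHz = 44_100f // assumed example value
        val captureSize = 1024
        println("bucket width: ${binWidthHz(samplingRateHz, captureSize)} Hz")
        println("useful buckets: ${captureSize / 2}")
        println("top frequency: ${binFrequencyHz(captureSize / 2, samplingRateHz, captureSize)} Hz")
    }
    ```

    With these assumed values each bucket is about 43 Hz wide, so a range like 0-63Hz covers barely more than one bucket, while 6250-16000Hz covers more than two hundred.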

    There is only a very limited range of permitted capture sizes and capture rates, based on hardware capabilities and plain old physics. Also, these two properties are inversely proportional: if your capture size is big, your capture rate has to be small. Audio is produced as a stream of evenly timed amplitudes (where the spacing is 1 / samplingRate seconds). Suppose for simplicity that the audio stream has a sampling rate of only 1024 Hz, producing 1024 amplitudes per second. If your capture rate is 1 per second, you are collecting all 1024 of those amplitudes each time you capture, so your capture size is 1024. If your capture rate is 2 per second, you are collecting 512 amplitudes on each capture, so your capture size is 512.

    Note: I don't know for sure what happens if you set a capture size that doesn't inversely match the capture rate passed to setDataCaptureListener, that is, whether it ignores the size you set or actually repeats or drops data. I always use Visualizer.getMaxCaptureRate() as the capture rate.

    What you can do (and it won't be exact) is average the appropriate ranges, although I think you'll want to apply the log function to the magnitude before you average, or the results won't look great. You definitely need to apply a log function to the magnitudes at some point before visualizing them for a visualizer to make sense to the viewer.

    So after selecting a capture size you can prepare ranges to use for collecting the results.

    private val targetEndpoints = listOf(0f, 63f, 160f, 400f, 1000f, 2500f, 6250f, 16000f)
    private val DESIRED_CAPTURE_SIZE = 1024 // A typical value, has worked well for me
    private lateinit var frequencyOrdinalRanges: List<IntRange>
    //...
    
    // Set the capture size while the Visualizer is still disabled
    val captureSizeRange = Visualizer.getCaptureSizeRange().let { it[0]..it[1] }
    val captureSize = DESIRED_CAPTURE_SIZE.coerceIn(captureSizeRange)
    visualizer.captureSize = captureSize
    // Visualizer reports the sampling rate in milliHertz, so convert it to Hz
    val samplingRateHz = visualizer.samplingRate / 1000f
    frequencyOrdinalRanges = targetEndpoints.zipWithNext { a, b ->
            // The + 1 omits the DC offset in the first range, and the overlap for remaining ranges
            val startOrdinal = 1 + (captureSize * a / samplingRateHz).toInt()
            // Stay below captureSize / 2, the last ordinal stored as a (real, imaginary) pair
            val endOrdinal = (captureSize * b / samplingRateHz).toInt().coerceAtMost(captureSize / 2 - 1)
            startOrdinal..endOrdinal
        }
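
    As a rough check of what these ranges look like, here is the same calculation as a standalone sketch without the Visualizer. The function name ordinalRanges is hypothetical, the sampling rate is assumed to be already in Hz, and 1024 samples at 44100 Hz are just example values.

    ```kotlin
    // Hypothetical helper mirroring the range calculation above, without Android classes.
    fun ordinalRanges(
        endpointsHz: List<Float>,
        captureSize: Int,
        samplingRateHz: Float
    ): List<IntRange> =
        endpointsHz.zipWithNext { a, b ->
            // + 1 skips the DC bucket for the first range and avoids overlap for the rest
            val start = 1 + (captureSize * a / samplingRateHz).toInt()
            val end = (captureSize * b / samplingRateHz).toInt()
            start..end
        }

    fun main() {
        val endpoints = listOf(0f, 63f, 160f, 400f, 1000f, 2500f, 6250f, 16000f)
        // Assumed example values: capture size 1024 at 44100 Hz
        ordinalRanges(endpoints, 1024, 44_100f).forEach(::println)
    }
    ```

    With those assumed values this prints 1..1, 2..3, 4..9, 10..23, 24..58, 59..145 and 146..371, which shows how coarse the low ranges are: the first one is a single bucket. A smaller capture size can even leave a range empty, so the averaging code should be prepared for that.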
    

    And then in your listener

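    // Needs: import kotlin.math.hypot and import kotlin.math.log10
    override fun onFftDataCapture(
        visualizer: Visualizer,
        fft: ByteArray,
        samplingRate: Int
    ) {
        val output = FloatArray(frequencyOrdinalRanges.size)
        for ((i, ordinalRange) in frequencyOrdinalRanges.withIndex()) {
            if (ordinalRange.isEmpty()) continue // No bucket falls into this range
            var logMagnitudeSum = 0f
            for (k in ordinalRange) {
                // Ordinal k is stored as a (real, imaginary) byte pair at indices 2k and 2k + 1
                val fftIndex = k * 2
                val magnitude = hypot(fft[fftIndex].toFloat(), fft[fftIndex + 1].toFloat())
                // Clamp to 1 so a silent bucket contributes log10(1) = 0 instead of -Infinity
                logMagnitudeSum += log10(magnitude.coerceAtLeast(1f))
            }
            output[i] = logMagnitudeSum / (ordinalRange.last - ordinalRange.first + 1)
        }
        // If you want magnitude to be on a 0..1 scale, you can divide it by log10(hypot(128f, 128f)),
        // since each byte ranges from -128 to 127
        // Do something with output
    }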
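
    The averaging step can also be tried out on synthetic data, away from the device. The following is only a sketch: bandLevel is a hypothetical helper, it assumes the byte layout used in the listener above (ordinal k as a real/imaginary pair at indices 2k and 2k + 1), and clamping silent buckets to a magnitude of 1 is my own choice to avoid log10(0).

    ```kotlin
    import kotlin.math.hypot
    import kotlin.math.log10

    // Hypothetical helper: average log-magnitude over one ordinal range, scaled to 0..1.
    fun bandLevel(fft: ByteArray, ordinalRange: IntRange): Float {
        if (ordinalRange.isEmpty()) return 0f
        // Largest magnitude that two signed bytes can produce
        val maxLog = log10(hypot(128f, 128f))
        var sum = 0f
        for (k in ordinalRange) {
            val magnitude = hypot(fft[2 * k].toFloat(), fft[2 * k + 1].toFloat())
            // Clamp so that silence yields 0 rather than -Infinity
            sum += log10(magnitude.coerceAtLeast(1f))
        }
        val count = ordinalRange.last - ordinalRange.first + 1
        return (sum / count / maxLog).coerceIn(0f, 1f)
    }

    fun main() {
        val fft = ByteArray(1024)       // synthetic capture, all buckets silent
        fft[2 * 5] = 100                // except ordinal 5, real part 100
        println(bandLevel(fft, 4..9))   // one loud bucket among six
        println(bandLevel(fft, 10..23)) // silent range
    }
    ```

    Because the logarithm is taken per bucket before averaging, a single loud bucket raises the level of its range only moderately, which is the behaviour the strategy above is after.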
    

    I did not test any of the above, so there might be errors. Just trying to communicate the strategy.