[SOLVED] Distortion in ESP32 I2S audio playback with external DAC for sample frequency higher than 20kSps

Hardware: ESP32 DevKitV1, PCM5102 breakout board, SD-card adapter.
Software: Arduino framework.

For some time I have been struggling with audio playback using an I2S DAC external to the ESP32. The problem is that playback is only free of distortion at low sample rates, i.e. below 20 kSps. I have been studying the documentation, https://docs.espressif.com/projects/esp-idf/en/latest/api-reference/peripherals/i2s.html, and numerous other sources, but still haven't managed to fix this.

I2S configuration function:

esp_err_t I2Smixer::i2sConfig(int bclkPin, int lrckPin, int dinPin, int sample_rate)
{
    // i2s configuration: Tx to ext DAC, 2's complement 16-bit PCM, mono
    const i2s_config_t i2s_config = {
        .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_TX), // master, Tx only, external DAC; mono is selected via channel_format
        .sample_rate = sample_rate,
        .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
        .channel_format = I2S_CHANNEL_FMT_ONLY_RIGHT, // single channel
                                                      // .channel_format = I2S_CHANNEL_FMT_RIGHT_LEFT, //2-channels
        .communication_format = (i2s_comm_format_t)(I2S_COMM_FORMAT_I2S | I2S_COMM_FORMAT_I2S_MSB),
        .intr_alloc_flags = ESP_INTR_FLAG_LEVEL3, // highest interrupt priority that can be handled in C
        .dma_buf_count = 128, //16,
        .dma_buf_len = 128, // 64
        .use_apll = false,
        .tx_desc_auto_clear = true};

    const i2s_pin_config_t pin_config = {
        .bck_io_num = bclkPin,           //this is BCK pin
        .ws_io_num = lrckPin,            // this is LRCK pin
        .data_out_num = dinPin,          // this is DATA output pin
        .data_in_num = I2S_PIN_NO_CHANGE // Not used
    };
    esp_err_t ret1 = i2s_driver_install((i2s_port_t)i2s_num, &i2s_config, 0, NULL);
    esp_err_t ret2 = i2s_set_pin((i2s_port_t)i2s_num, &pin_config);
    esp_err_t ret3 = i2s_set_sample_rates((i2s_port_t)i2s_num, sample_rate);
    // i2s_adc_disable((i2s_port_t)i2s_num);
    // esp_err_t ret3 =  rtc_clk_apll_enable(1, 15, 8, 5, 6);

    return (ret1 != ESP_OK) ? ret1 : (ret2 != ESP_OK) ? ret2 : ret3; // first failure, not a sum of error codes
}

A wave file, which was created in a 16 bit mono PCM, 44.1kHz format, is opened:

File sample_file = SD.open("/test.wav");

In the main loop, the samples are fed to the I2S driver.

esp_err_t I2Smixer::loop()
{
    esp_err_t ret1 = ESP_OK, ret2 = ESP_OK;
    int32_t output = 0;

    if (sample_file.available())
    {
        if (sample_file.size() - sample_file.position() > 2) // bytes left
        {
            int16_t tmp; // 16 bits signed PCM assumed
            sample_file.read((uint8_t *)&tmp, 2);
            output = (int32_t)tmp;
        }
        else
        {
            sample_file.close();
        }
    }

    size_t i2s_bytes_write;
    int16_t int16_t_output = (int16_t)output;
    ret1 = i2s_write((i2s_port_t)i2s_num, &int16_t_output, 2, &i2s_bytes_write, portMAX_DELAY);
    if (i2s_bytes_write != 2)
        ret2 = ESP_FAIL;

    return (ret1 != ESP_OK) ? ret1 : ret2; // first failure, not a sum of error codes
}

This works fine for sample rates up to 20 kSps. At 32 kSps or 44.1 kSps, heavy distortion occurs. I suspect this is caused by the I2S DMA Tx buffer: if the number of DMA buffers (dma_buf_count) and the buffer length (dma_buf_len) are increased, the sound plays fine at first, but after a short time the distortion kicks in again. I could not measure this time span (maybe around a second), but I did notice that it depends on dma_buf_count and dma_buf_len.

In addition, I tried increasing the CPU frequency to 240 MHz, with no improvement. I also tried playing a file from SPIFFS, again with no improvement.

I am out of ideas right now. Has anyone else encountered this issue?

Solution

Reading one sample at a time and pushing it to the I²S driver is not an efficient use of the driver. You are using just 2 bytes in every 128-byte DMA buffer. That leaves just a single sample period to push the next sample before the DMA buffer is "starved".
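
To put numbers on that deadline, here is a small host-side sketch of the arithmetic. The helper names are hypothetical (they are not part of the ESP-IDF API), and the only inputs are the sample rates mentioned in the question and a 64-sample chunk.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only (not part of the ESP-IDF API).

// Time available per sample, in nanoseconds (rounded down), at a given rate.
constexpr uint64_t samplePeriodNs(uint32_t sampleRateHz)
{
    return 1000000000ULL / sampleRateHz;
}

// Playback time covered by a chunk of `samples` samples, in microseconds
// (rounded down).
constexpr uint64_t chunkDurationUs(uint32_t samples, uint32_t sampleRateHz)
{
    return (1000000ULL * samples) / sampleRateHz;
}
```

At 20 kSps one sample period is 50 µs, while at 44.1 kSps it is about 22.7 µs, so a per-sample loop has to complete one file read plus one i2s_write call in that time. A 64-sample chunk at 44.1 kSps covers about 1.45 ms for the same pair of calls.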

Read the file in 128-byte (64-sample) chunks and write each whole chunk to the I²S driver in order to use the DMA effectively.
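
A minimal sketch of such a chunked transfer follows. It is written against two generic callables so that it compiles off-target; pumpChunk, readFn and writeFn are hypothetical names introduced here, not part of the Arduino or ESP-IDF APIs.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: moves one chunk from a byte source to a byte sink.
//   readFn(buf, n)  -> bytes actually read    (e.g. wraps sample_file.read)
//   writeFn(buf, n) -> bytes actually written (e.g. wraps i2s_write)
// Returns the number of bytes forwarded; 0 means nothing was left to send.
template <typename ReadFn, typename WriteFn>
size_t pumpChunk(ReadFn readFn, WriteFn writeFn, uint8_t *buf, size_t chunkBytes)
{
    size_t got = readFn(buf, chunkBytes);
    got -= got % 2;              // keep whole 16-bit samples only
    size_t sent = 0;
    while (sent < got) {         // the sink may accept fewer bytes than offered
        size_t n = writeFn(buf + sent, got - sent);
        if (n == 0)
            break;               // sink accepted nothing: give up on this chunk
        sent += n;
    }
    return sent;
}
```

On the ESP32, readFn could wrap sample_file.read(buf, n), and writeFn could call i2s_write((i2s_port_t)i2s_num, buf, n, &i2s_bytes_write, portMAX_DELAY) and return i2s_bytes_write, with a 128-byte buffer passed as buf. loop() would then call pumpChunk once per iteration instead of handling a single sample.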

Depending on the file-system implementation, it may also be a little more efficient to use larger chunks that are sympathetic to the file system's media, sector size and DMA buffering.
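
One way to pick such a chunk size is to round the request up to a whole number of sectors. This is only a sketch: the 512-byte sector size is an assumption (typical for SD cards, not stated above), and alignChunkBytes is a hypothetical helper name.

```cpp
#include <cstddef>

// Hypothetical helper: round a requested chunk size up to a whole number of
// media sectors. sectorBytes must be non-zero.
constexpr size_t alignChunkBytes(size_t requestedBytes, size_t sectorBytes)
{
    return ((requestedBytes + sectorBytes - 1) / sectorBytes) * sectorBytes;
}

// Assumed sector size (typical for SD cards; check your own media).
constexpr size_t kAssumedSectorBytes = 512;

// A 128-byte request becomes one full 512-byte sector (256 16-bit samples).
constexpr size_t kChunkBytes = alignChunkBytes(128, kAssumedSectorBytes);
```

Whether this helps in practice depends on how the file-system layer buffers reads, so it is worth measuring on the actual hardware.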