c++ ffmpeg segmentation-fault portaudio libswresample

PortAudio To FFmpeg Resampling Resulting in Segmentation Fault


I am capturing audio from a microphone with PortAudio (PA) and then need to resample it to 44,100 Hz, which I'm attempting to do with FFmpeg. The mic I'm currently testing with has a sample rate of 48,000 Hz, but this won't always be the case when the application is used. Whenever I attempt to resample with swr_convert, I get a segmentation fault. I am initializing the SwrContext with:

this->swr_ctx = swr_alloc_set_opts(
    nullptr,                   // No current context
    num_channels,              // The number of channels I'm getting from PA
    AV_SAMPLE_FMT_S16,         // 16 bit Signed, should correspond to paInt16
    FINAL_SAMPLE_RATE,         // 44100
    num_channels,              // The number of channels I'm getting from PA
    AV_SAMPLE_FMT_S16,         // 16 bit Signed, should correspond to paInt16
    this->source_sample_rate,  // Mic I'm testing with currently is 48000, but depends on source
    0,                         // Logging offset (0 is what examples use, so I did too)
    nullptr                    // "parent logging context, can be NULL"
);

I know PA itself is working correctly, because the project works if I hard-code the sample rate elsewhere in the project. The callback looks like this:

auto paCallback( const void *inputBuffer, void *outputBuffer, unsigned long framesPerBuffer, const PaStreamCallbackTimeInfo* timeInfo, PaStreamCallbackFlags statusFlags, void *userData ) -> int {
    // Calls class's callback handler
    return ((Audio*)userData)->classPaCallback((const uint8_t **)inputBuffer);
}
// Class callback handler
auto Audio::classPaCallback(const uint8_t **inputBuffer) -> int {
    // This line throws SIGSEGV
    int out_count = swr_convert(swr_ctx, this->resample_buffer, BUFFER_CHUNK_SIZE, inputBuffer, this->source_buffer_size);
    if (out_count < 0) {
        throw std::runtime_error("Error resampling audio");
    }
    // Add data to buffers to handle outside of callback context (This is a special context according to PA docs)
    return 0;
}

Playing around with the swr_convert line, I set both the out_count and in_count parameters (BUFFER_CHUNK_SIZE and this->source_buffer_size) to 0 to make sure the code would at least run, and that worked. I then set one of them to 1 and left the other at 0 to find out which buffer access was raising the SIGSEGV: it was raised whenever in_count (the buffer from PA) was not 0. What am I doing wrong when passing the audio from PA to FFmpeg?

I do know that the PA audio is "interleaved" (i.e. input[0] is the first sample from channel 0, input[1] is the first sample from channel 1, etc.). Is this also the format that FFmpeg uses, or should I create a separate SwrContext for each channel?
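The interleaved layout described here can be made concrete with a short sketch. For reference, AV_SAMPLE_FMT_S16 is FFmpeg's packed (interleaved) 16-bit format, while AV_SAMPLE_FMT_S16P is the planar variant with one buffer per channel. The helper below, `deinterleave`, is a hypothetical name and not part of PortAudio or FFmpeg; it only shows how frame `f`, channel `c` maps to index `f * channels + c` in an interleaved buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a PortAudio or FFmpeg function): splits an
// interleaved 16-bit buffer into one vector of samples per channel.
std::vector<std::vector<int16_t>> deinterleave(const int16_t *interleaved,
                                               std::size_t frames,
                                               std::size_t channels) {
    std::vector<std::vector<int16_t>> planes(channels);
    for (auto &plane : planes) {
        plane.reserve(frames);
    }
    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < channels; ++c) {
            // In an interleaved buffer, the sample for frame f and
            // channel c is stored at index f * channels + c.
            planes[c].push_back(interleaved[f * channels + c]);
        }
    }
    return planes;
}
```

Called with a stereo buffer holding three frames, this would yield two vectors of three samples each, one per channel.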

In case this is helpful, this->resample_buffer is successfully initialized with

av_samples_alloc_array_and_samples(
    &this->resample_buffer, // Buffer
    nullptr,                // "linesize", not used from what I could tell
    this->num_channels,     // number of channels expected
    BUFFER_CHUNK_SIZE,      // Number of frames to be stored per channel
    AV_SAMPLE_FMT_S16,      // 16 bit Signed, should correspond to paInt16
    0                       // For alignment, not needed here
);

Solution

  • Since resampling directly after receiving the data from PortAudio (PA) was not working, and no one pointed out a problem right away, I figured I wasn't making any obviously "dumb" mistake. After a bit of a break, I tried again, but instead of resampling right away, I first added the data to the appropriate buffer (based on which PA channel it came from) and applied the resampling later, when removing the data from the buffer. Doing it this way let me resample one channel at a time, so I didn't have to work out how to pass multiple channels to FFmpeg. For anyone who comes across this in the future, the code ended up looking like this:

        auto *converted_sample = (uint16_t *) malloc(sizeof(uint16_t) * this->converted_sample_max);
        if (converted_sample == nullptr) {
            throw std::runtime_error("Failed to allocate memory for converted sample");
        }
        uint16_t *sample = buffer.pop(); // Gets data from circular buffer
        if (sample == nullptr) {
            free(converted_sample); // IMPORTANT to not incur a memory leak
            return return_data{nullptr, 0};
        }
        if (swr_ctx == nullptr) {
            free(converted_sample); // IMPORTANT to not incur a memory leak
            throw std::runtime_error("swr_ctx is not initialized");
        }
        int frames = swr_convert(swr_ctx, (uint8_t **)&converted_sample, this->converted_sample_max, (const uint8_t **)&sample, BUFFER_CHUNK_SIZE);
        free(sample); // In my case, it is the job of whoever takes the data out of the circular buffer to free() it
        if (frames < 0) {
            free(converted_sample); // Prevent memory leak
            throw std::runtime_error("no frames converted");
        }
        return return_data{converted_sample, frames}; // A struct I made
    }
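A likely reason this version runs where the callback version crashed is the way the input pointer is passed. swr_convert takes arrays of plane pointers, and for a packed format such as AV_SAMPLE_FMT_S16 only the first entry is read. Here `&sample` is the address of a pointer to the data, whereas the callback cast PortAudio's data pointer itself to `const uint8_t **`, so the first bytes of audio would be interpreted as a pointer. The sketch below illustrates the difference; `first_sample` is a hypothetical stand-in with the same input shape as swr_convert, used so the example compiles without FFmpeg.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in with the same input shape as swr_convert's `in`
// argument: an array of plane pointers. For packed audio only in[0] is read.
int16_t first_sample(const uint8_t **in) {
    int16_t s;
    std::memcpy(&s, in[0], sizeof(s));  // treats in[0] as a pointer to sample data
    return s;
}

// Mimics what an interleaved PortAudio callback receives: a pointer to the
// raw sample data, not an array of pointers.
int16_t read_from_callback_buffer(const void *inputBuffer) {
    // Casting inputBuffer straight to (const uint8_t **) would make the
    // callee read audio bytes as if they were a pointer.
    // Instead, build a one-element array of plane pointers that refers to the data.
    const uint8_t *planes[1] = { static_cast<const uint8_t *>(inputBuffer) };
    return first_sample(planes);
}
```

Under this reading, the same one-element array could be passed as the `in` argument of swr_convert inside the callback, provided the context is configured for the right number of channels.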
    

    this->converted_sample_max is initialized with

    this->converted_sample_max = av_rescale_rnd(BUFFER_CHUNK_SIZE, FINAL_SAMPLE_RATE, source_sample_rate, AV_ROUND_UP);
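As a worked example of this sizing arithmetic, here is a minimal sketch in plain C++. `rescale_round_up` is a hypothetical helper that reproduces the round-up division av_rescale_rnd performs with AV_ROUND_UP for small non-negative inputs; it is not the FFmpeg implementation. The chunk sizes used below (480 and 1024 frames) are assumed values, since BUFFER_CHUNK_SIZE is not given above. FFmpeg's own resampling example also adds the value of swr_get_delay to the input frame count before rescaling, which the `delay_frames` parameter stands in for.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the arithmetic of
// av_rescale_rnd(a, b, c, AV_ROUND_UP) for small non-negative inputs:
// computes ceil(a * b / c) using integer math only.
int64_t rescale_round_up(int64_t a, int64_t b, int64_t c) {
    return (a * b + c - 1) / c;
}

// Upper bound on the number of output frames for `in_frames` input frames.
// `delay_frames` stands in for the value swr_get_delay(swr_ctx, in_rate)
// would report; pass 0 to match the formula used in the post.
int64_t max_output_frames(int64_t in_frames, int64_t delay_frames,
                          int64_t in_rate, int64_t out_rate) {
    return rescale_round_up(in_frames + delay_frames, out_rate, in_rate);
}
```

With these assumptions, 480 input frames at 48,000 Hz need room for 441 output frames at 44,100 Hz, and 1024 input frames need room for 941.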