iphone ios11 avassetwriter cmsamplebuffer

iPhone 11: unexpected number of audio samples


I have an app that captures audio and video using AVAssetWriter. It runs a fast Fourier transform (FFT) on the audio to create a visual spectrum of the captured audio in real time.

Up until the release of the iPhone 11, this all worked fine. Users with an iPhone 11, however, are reporting that audio is not being captured at all. I have managed to narrow down the issue: the number of samples delivered to captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) is either 940 or 941, whereas on previous phone models it is always 1024. I use CMSampleBufferGetNumSamples to get the number of samples. My FFT calculations rely on the number of samples being a power of 2, so the app drops every audio buffer on the newer iPhones.

Can anybody shed light on why the new iPhone 11 is returning an unusual number of samples? Here is how I have configured the AVAssetWriter:

self.videoWriter = try AVAssetWriter(outputURL: self.outputURL, fileType: AVFileType.mp4)
var videoSettings: [String : Any]
if #available(iOS 11.0, *) {
    videoSettings = [
        AVVideoCodecKey  : AVVideoCodecType.h264,
        AVVideoWidthKey  : Constants.VIDEO_WIDTH,
        AVVideoHeightKey : Constants.VIDEO_HEIGHT,
    ]
} else {
    videoSettings = [
        AVVideoCodecKey  : AVVideoCodecH264,
        AVVideoWidthKey  : Constants.VIDEO_WIDTH,
        AVVideoHeightKey : Constants.VIDEO_HEIGHT,
    ]
}

// Video input: bind the optionals instead of force-unwrapping them
let videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoSettings)
videoInput.expectsMediaDataInRealTime = true
if let writer = videoWriter, writer.canAdd(videoInput) {
    writer.add(videoInput)
}
videoWriterVideoInput = videoInput

//Audio Settings
let audioSettings : [String : Any] = [
    AVFormatIDKey : kAudioFormatMPEG4AAC,
    AVSampleRateKey : Constants.AUDIO_SAMPLE_RATE, //Float(44100.0)
    AVEncoderBitRateKey : Constants.AUDIO_BIT_RATE, //64000
    AVNumberOfChannelsKey: Constants.AUDIO_NUMBER_CHANNELS //1
]

// Audio input: bind the optionals instead of force-unwrapping them
let audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioSettings)
audioInput.expectsMediaDataInRealTime = true
if let writer = videoWriter, writer.canAdd(audioInput) {
    writer.add(audioInput)
}
videoWriterAudioInput = audioInput



Solution

  • You can't assume a fixed sample rate. It depends on the microphone and many other characteristics of the device, so it will not always be the same. The FFT library I'm using (TempiFFT) has to be told the sample rate, so to get this to work you need to detect the sample rate ahead of time.

    Rather than:

    let fft = TempiFFT(withSize: 1024, sampleRate: Constants.AUDIO_SAMPLE_RATE)
    

    I need to first detect what the sample rate is when I start my AVCaptureSession, and then pass that detected value to the FFT library:

    //During initialization of AVCaptureSession
    audioSampleRate = Float(AVAudioSession.sharedInstance().sampleRate)
    ...
    //Run FFT calculations
    let fft = TempiFFT(withSize: 1024, sampleRate: audioSampleRate)
    

    Update

    On some devices, you may not receive a full 1024 samples in each callback (on the iPhone 11 I was getting 941). If the FFT is not given the right number of samples, you may get unexpected behavior. I needed to create a circular buffer that stores the samples from each callback until I had at least 1024 samples to perform the FFT.
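The buffering described in the update could look roughly like the sketch below. It is not the code from the app: `SampleAccumulator` is a hypothetical helper name, it assumes the samples have already been extracted from each CMSampleBuffer as `Float` values (that step is not shown), and for brevity it uses a plain array as a FIFO rather than a true circular buffer.

```swift
// Hypothetical helper (not part of TempiFFT or AVFoundation): collects the
// samples from each callback and hands them out in fixed-size windows.
struct SampleAccumulator {
    let windowSize: Int
    private var pending: [Float] = []

    init(windowSize: Int) {
        precondition(windowSize > 0, "windowSize must be positive")
        self.windowSize = windowSize
    }

    // Number of samples still waiting for the next full window.
    var pendingCount: Int { pending.count }

    // Appends one callback's worth of samples and returns every complete
    // window that is now available (possibly none).
    mutating func append(_ samples: [Float]) -> [[Float]] {
        pending.append(contentsOf: samples)
        var windows: [[Float]] = []
        var start = 0
        while pending.count - start >= windowSize {
            windows.append(Array(pending[start..<(start + windowSize)]))
            start += windowSize
        }
        // Drop the samples that were handed out; keep the remainder.
        pending.removeFirst(start)
        return windows
    }
}

// Simulate two iPhone 11 callbacks with 941 and 940 samples.
var accumulator = SampleAccumulator(windowSize: 1024)
let first = accumulator.append([Float](repeating: 0, count: 941))
let second = accumulator.append([Float](repeating: 0, count: 940))
```

In the capture callback, each window returned by `append` would then be passed to the FFT with the detected sample rate, while the leftover samples wait for the next callback. As an aside on the numbers in the question: 1024 × 44100 / 48000 = 940.8, so buffers of 940 or 941 samples would be consistent with hardware running at 48 kHz while the app expects 44.1 kHz. That is an inference from the arithmetic, not something confirmed above.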