I have an `AVCaptureVideoDataOutput` producing `CMSampleBuffer` instances that are passed into my `AVCaptureVideoDataOutputSampleBufferDelegate` function. I want to efficiently convert the pixel buffers into `CGImage` instances for use elsewhere in my app.
I have to be careful not to retain any references to these pixel buffers, or the capture session will start dropping frames for reason `OutOfBuffers`. Also, if the conversion takes too long, frames will be discarded for reason `FrameWasLate`.
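For reference, this is roughly how one can confirm the drop reason: implement the delegate's `didDrop` callback and read the standard CoreMedia attachment off the dropped buffer. A minimal sketch:

```swift
public func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    // The drop reason is attached to the dropped sample buffer itself.
    if let reason = CMGetAttachment(sampleBuffer,
                                    key: kCMSampleBufferAttachmentKey_DroppedFrameReason,
                                    attachmentModeOut: nil) {
        print("Dropped frame, reason: \(reason)") // e.g. "OutOfBuffers" or "FrameWasLate"
    }
}
```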
Previously I tried using a `CIContext` to render the `CGImage`, but this proved too slow when capturing above 30 FPS, and I want to capture at 60 FPS. In my tests I got up to 38 FPS before frames started getting dropped.
Now I am attempting to use a `CGContext`, and the results are better. I'm still dropping frames, but significantly less frequently.
```swift
public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    // Capture at 60 FPS but only process at 4 FPS, ignoring all other frames
    let timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    guard timestamp - lastTimestamp >= CMTimeMake(value: 1, timescale: 4) else { return }
    lastTimestamp = timestamp

    // Extract the pixel buffer
    guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    // Lock the pixel buffer before accessing its base address
    guard kCVReturnSuccess == CVPixelBufferLockBaseAddress(imageBuffer, .readOnly) else { return }
    defer { CVPixelBufferUnlockBaseAddress(imageBuffer, .readOnly) }

    // Use a CGContext to render a CGImage from the pixel buffer.
    // `cgColorSpace` and `cgBitmapInfo` are properties configured elsewhere
    // to match the pixel buffer's format (e.g. BGRA).
    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(imageBuffer),
                                  width: CVPixelBufferGetWidth(imageBuffer),
                                  height: CVPixelBufferGetHeight(imageBuffer),
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(imageBuffer),
                                  space: cgColorSpace,
                                  bitmapInfo: cgBitmapInfo),
          let cgimage = context.makeImage() else { return }

    // Do something with cgimage...
}
```
I was curious and next tried this without locking the pixel buffer base address. When I comment out those two lines, I stop dropping frames completely without any noticeable repercussions. It seems that the lock mechanism was taking so long that frames were being dropped, and removing the mechanism significantly reduced the function's running time and allowed all frames to be handled.
Apple's documentation explicitly states that calling `CVPixelBufferLockBaseAddress` is required prior to `CVPixelBufferGetBaseAddress`. However, because the `AVCaptureVideoDataOutput` is using a pre-defined pool of memory for its sample buffers, perhaps the base address isn't subject to change as it normally would be.
Can I skip locking the base address here? What is the worst that could happen if I don't lock the base address in this specific scenario?
This question was ill-founded from the start because I neglected to test the actual image result of skipping the lock. As stated in the question, when I lock the base address prior to initializing the `CGContext`, the `makeImage` render takes approximately 17 milliseconds. If I skip the locking and go straight to the `CGContext`, then `makeImage` takes 0.3 milliseconds.

I had wrongly interpreted this speed difference to mean that the rendering was being accelerated by the GPU in the latter case. However, what was actually happening was that `CVPixelBufferGetBaseAddress` was returning `nil`, and `makeImage` was rendering no data, producing a purely white `CGImage`.
So, in short, the answer to my question is no, the lock cannot be skipped: the base address must be locked.
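One cheap safeguard against this failure mode is to guard the base address itself rather than passing the call result straight into the `CGContext` initializer, so a `nil` buffer bails out instead of silently rendering a blank image. A small sketch:

```swift
// After locking, the base address should be non-nil; if it isn't,
// skip the frame rather than rendering an empty (white) image.
guard let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer) else { return }
// ...pass `baseAddress` as the `data:` argument to the CGContext initializer...
```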
Now I am off to figure out how to speed this up. I am capturing at 60 FPS, which means I want my rendering to take less than 16 milliseconds if possible, so that I can drop the `CMSampleBuffer` reference before the next one arrives.
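To keep an eye on that budget, a minimal timing sketch around the render call (using `CFAbsoluteTimeGetCurrent`; `context` is the `CGContext` from above):

```swift
// Measure how long the render takes; at 60 FPS the whole callback
// should finish in under ~16.7 ms to avoid dropping frames.
let start = CFAbsoluteTimeGetCurrent()
guard let cgimage = context.makeImage() else { return }
let elapsedMs = (CFAbsoluteTimeGetCurrent() - start) * 1000
print("makeImage took \(String(format: "%.2f", elapsedMs)) ms")
```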