swift audio avfoundation avassetwriter avassetwriterinput

How do you sync video and audio using AVAssetWriter?


Background

I use AVAssetWriterInput.append to append sample buffers to the writer. When the user temporarily disables audio, I stop calling append on the audio input, while append on the video input keeps being called for every frame.

Problem

If the user pauses the audio and resumes it later, then in the final video the audio recorded after resuming starts playing at the point where the user paused, not at the point where they resumed.

Example

'=' refers to CMSampleBuffer.

'|' means the user paused the audio input.


Video: ---------------=================================

Audio(expected): ----=======|----------------=============

Audio(I got): ---------=======|=============----------------


Code

func appendBuffer(_ buffer: CMSampleBuffer, of type: BufferType) {
    guard let writer else { return }
    guard writer.status == .writing else {
        logger.warning("AVAssetWriter is not ready. Status: \(writer.status.rawValue). Error: \(writer.error.debugDescription)")
        return
    }
    
    // Start the session when the first video frame arrives.
    if isFirstFrame && type == .video {
        startInputPipeline(with: buffer)
        isFirstFrame = false
    }
    
    guard isWriting else { return }
    
    switch type {
    case .video:
        // Check the status of the buffer to decide whether to append it or not.
        guard statusOfSampleBuffer(buffer) == .complete else { return }
        if videoInput?.isReadyForMoreMediaData == true {
            guard buffer.imageBuffer != nil else {
                logger.info("Complete but no updated pixels.")
                return
            }
            processQueue.async { [self] in
                videoInput?.append(buffer)
            }
        }
    case .audio:
        if audioInput?.isReadyForMoreMediaData == true {
            guard buffer.dataBuffer != nil else { return }
            processQueue.async { [self] in
                audioInput?.append(buffer)
            }
        }
    }
}

I have printed the presentationTime from the audio sample buffer. It turns out it's correct.

Maybe my understanding of AVAssetWriterInput.append is wrong?

My current solution is to always append the buffer, but when the user wants to pause, I simply append an empty SampleBuffer filled with nothing.

I don't think this is the best way to deal with it.

Is there any way to sync the buffer time with the video?


Solution

  • I would recommend stepping away from pausing the writing and instead writing silent audio buffers or black video frames. It's easy to do.

    For Audio

        private func _cacheAudioBuffer(sampleBuffer: CMSampleBuffer, isMuted: Bool) {
            // When muted, overwrite the audio payload with zeros in place.
            // The buffer keeps its original timing, only its contents become silent.
            if isMuted,
               let ref = CMSampleBufferGetDataBuffer(sampleBuffer) {
                CMBlockBufferFillDataBytes(
                    with: 0,
                    blockBuffer: ref,
                    offsetIntoDestination: 0,
                    dataLength: CMBlockBufferGetDataLength(ref))
            }
            // Continue here with the (possibly muted) sample buffer
        }
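
    To connect this to the append path in the question, here is a minimal sketch of an audio branch that always appends and only silences the payload while muted. The names silence and appendAudio are hypothetical, and the sketch assumes the buffers carry linear PCM (where all-zero bytes decode as silence) in a writable block buffer.

```swift
import AVFoundation

/// Overwrites every byte of the block buffer with zero.
/// Assumption: the data is linear PCM, so zero bytes mean silence.
@discardableResult
func silence(_ block: CMBlockBuffer) -> OSStatus {
    CMBlockBufferFillDataBytes(with: 0,
                               blockBuffer: block,
                               offsetIntoDestination: 0,
                               dataLength: CMBlockBufferGetDataLength(block))
}

/// Hypothetical stand-in for the `.audio` case: the buffer is appended
/// whether or not the user has muted audio.
/// Returns false if the input was not ready or the append failed.
func appendAudio(_ sampleBuffer: CMSampleBuffer,
                 to input: AVAssetWriterInput,
                 isMuted: Bool) -> Bool {
    guard input.isReadyForMoreMediaData else { return false }
    if isMuted, let block = CMSampleBufferGetDataBuffer(sampleBuffer) {
        silence(block)
    }
    // The presentation timestamp is left untouched, so the muted stretch
    // still occupies its real duration in the audio track.
    return input.append(sampleBuffer)
}
```

    Because nothing is skipped, the audio track has no gap that could collapse, which is the effect described in the question.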
    

    For Video (you can play with the frame properties to fit your needs)

        private func _createBlackPixelBuffer(with image: UIImage?, with size: PresetDimension) -> CVPixelBuffer? {
            // Reuse the cached buffer if one was already created.
            if let cached = self._blackPixelBuffer { return cached }
            guard let image = image,
                  let ciImage = CIImage(image: image) else {
                return nil
            }
            let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                         kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
            var pixelBuffer: CVPixelBuffer?
            let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(size.width), Int(size.height),
                                             kCVPixelFormatType_32BGRA, attrs, &pixelBuffer)
            // Bail out instead of force-unwrapping if the allocation failed.
            guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }
            // Render the (black) image into the buffer and cache it for later calls.
            CIContext().render(ciImage, to: buffer)
            self._blackPixelBuffer = buffer
            return buffer
        }
    

    Or, if you would rather end the file when the user pauses, stop the writing and start a new file when the user resumes.
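
    The stop-and-restart approach could look roughly like the sketch below. The helper names (segmentURL, finishSegment, startSegment), the .mov file type, and the makeInputs factory are all assumptions for illustration. Fresh inputs are created for each segment here, since an input stays tied to the writer it was added to.

```swift
import AVFoundation

/// Hypothetical naming scheme: one output file per recording segment.
func segmentURL(in directory: URL, index: Int) -> URL {
    directory.appendingPathComponent("segment-\(index).mov")
}

/// Finishes the current segment. Call this when the user pauses.
func finishSegment(_ writer: AVAssetWriter, completion: @escaping (Bool) -> Void) {
    // Only a writer that is actively writing can be finished.
    guard writer.status == .writing else {
        completion(false)
        return
    }
    writer.inputs.forEach { $0.markAsFinished() }
    writer.finishWriting {
        completion(writer.status == .completed)
    }
}

/// Starts a new segment. Call this when the user resumes.
/// `makeInputs` is a caller-supplied factory for this segment's inputs.
func startSegment(at url: URL,
                  firstBuffer: CMSampleBuffer,
                  makeInputs: () -> [AVAssetWriterInput]) throws -> AVAssetWriter {
    let writer = try AVAssetWriter(outputURL: url, fileType: .mov)
    for input in makeInputs() where writer.canAdd(input) {
        writer.add(input)
    }
    guard writer.startWriting() else {
        throw writer.error ?? CocoaError(.fileWriteUnknown)
    }
    // Anchor the segment's timeline to the first buffer it will contain.
    writer.startSession(atSourceTime: CMSampleBufferGetPresentationTimeStamp(firstBuffer))
    return writer
}
```

    Each segment needs a URL that does not exist yet, because AVAssetWriter does not overwrite an existing file. The result is one file per segment. If a single file is needed, the segments have to be joined afterwards, for example with AVMutableComposition, which is not shown here.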