Tags: ios, audio, video-streaming, h.264, muxer

Muxing compressed frames from VTCompressionSession with audio data into an MPEG2-TS container for network streaming


I'm working on a project that involves grabbing H.264-encoded frames from a VTCompressionSession in iOS 8, muxing them with live AAC or PCM audio from the microphone into a playable MPEG2-TS, and streaming that over a socket in real time with minimal delay (i.e., little to no buffering).

After watching the presentation on the new VideoToolbox APIs in iOS 8 and doing some research, I think it's safe to assume that:

  • In order to mux the Packetized Elementary Streams I would also need some library like libmpegts, or perhaps ffmpeg (using the libavcodec and libavformat libraries).

  • I would also need an AAC or PCM elementary stream from the microphone data (I presume PCM would be easier since no encoding is involved), which I don't know how to do either.

I'm pretty new to this. Can I get some advice on what would be the right approach to achieve this?

Is there an easier way to implement this using Apple APIs (like AVFoundation)?

Is there any similar project I can take as a reference?

Thanks in advance!


Solution

  • In order to mux the Packetized Elementary Streams I would also need some library like libmpegts, or perhaps ffmpeg (using the libavcodec and libavformat libraries).

    From what I can gather, there is no way to mux a TS with AVFoundation or related frameworks. While it seems like something you could do by hand, I'm trying to use the Bento4 library to accomplish the same task as you. I'm guessing libmpegts, ffmpeg, GPAC, libav, or any other library like that would work too, but I didn't like their APIs.

    Basically, I'm following Bento4's Mp42Ts.cpp sample, ignoring the MP4-reading parts and just looking at the TS-writing parts (the packet layout it produces is sketched below).

    This StackOverflow question has the outline of how to feed it video and an implementation of how to feed it audio (the video side is also sketched below). If you have any questions, ping me with a more specific question.

    I hope this provides a good starting point for you, though.
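    To make concrete what any of those muxing libraries are doing: an MPEG2-TS stream is just a sequence of fixed 188-byte packets, each with a 4-byte header, carrying slices of the PES packets that hold your H.264 and AAC frames. Here is a rough, purely illustrative sketch of that packet layout; the function is mine, not part of libmpegts or Bento4, and a real muxer also has to emit PAT/PMT tables, PES headers, PCR timestamps and proper stuffing via the adaptation field:

    #include <stdint.h>
    #include <string.h>

    #define TS_PACKET_SIZE 188

    // Fill one 188-byte TS packet carrying up to 184 bytes of PES payload.
    // pusi (payload_unit_start_indicator) is set on the packet that begins a
    // new PES packet; cc is the 4-bit continuity counter for this PID.
    static void write_ts_packet(uint8_t packet[TS_PACKET_SIZE],
                                uint16_t pid, int pusi, uint8_t cc,
                                const uint8_t *payload, size_t payload_len)
    {
        memset(packet, 0xFF, TS_PACKET_SIZE);  // padding (real stuffing uses an adaptation field)
        packet[0] = 0x47;                      // sync byte
        packet[1] = ((pusi ? 1 : 0) << 6) | ((pid >> 8) & 0x1F);
        packet[2] = pid & 0xFF;
        packet[3] = 0x10 | (cc & 0x0F);        // payload only, no adaptation field
        if (payload_len > 184) payload_len = 184;
        memcpy(&packet[4], payload, payload_len);
    }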
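    On the video-feeding side, the TS writer wants an H.264 elementary stream, while VTCompressionSession hands you CMSampleBuffers in AVCC form: length-prefixed NAL units, with the SPS/PPS stored in the format description rather than in the stream. Below is a hedged sketch of the usual conversion to Annex B (start-code-delimited) bytes, roughly what you would run on each encoded sample buffer before wrapping it in a PES packet. The function name and the NSMutableData output are my own placeholders, not code from this answer:

    #import <Foundation/Foundation.h>
    #import <CoreMedia/CoreMedia.h>

    static void AppendAnnexB(CMSampleBufferRef sampleBuffer, NSMutableData *output)
    {
        static const uint8_t startCode[4] = {0, 0, 0, 1};

        // Keyframes get SPS/PPS prepended so a client joining mid-stream can decode.
        CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
        BOOL isKeyframe = YES;
        if (attachments && CFArrayGetCount(attachments) > 0) {
            CFDictionaryRef attachment = (CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
            isKeyframe = !CFDictionaryContainsKey(attachment, kCMSampleAttachmentKey_NotSync);
        }

        if (isKeyframe) {
            CMFormatDescriptionRef fmt = CMSampleBufferGetFormatDescription(sampleBuffer);
            size_t paramSetCount = 0;
            CMVideoFormatDescriptionGetH264ParameterSetAtIndex(fmt, 0, NULL, NULL, &paramSetCount, NULL);
            for (size_t i = 0; i < paramSetCount; i++) {
                const uint8_t *paramSet = NULL;
                size_t paramSetSize = 0;
                CMVideoFormatDescriptionGetH264ParameterSetAtIndex(fmt, i, &paramSet, &paramSetSize, NULL, NULL);
                [output appendBytes:startCode length:4];
                [output appendBytes:paramSet length:paramSetSize];
            }
        }

        // Walk the AVCC data: each NAL unit is prefixed by a 4-byte big-endian
        // length (check NALUnitHeaderLengthOut above if you want to be strict).
        CMBlockBufferRef block = CMSampleBufferGetDataBuffer(sampleBuffer);
        size_t totalLength = 0;
        char *data = NULL;
        CMBlockBufferGetDataPointer(block, 0, NULL, &totalLength, &data);
        for (size_t offset = 0; offset + 4 <= totalLength; ) {
            uint32_t nalLength = CFSwapInt32BigToHost(*(uint32_t *)(data + offset));
            [output appendBytes:startCode length:4];
            [output appendBytes:data + offset + 4 length:nalLength];
            offset += 4 + nalLength;
        }
    }

    The PTS/DTS you need for the corresponding PES header come from CMSampleBufferGetPresentationTimeStamp and CMSampleBufferGetDecodeTimeStamp on the same sample buffer.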

  • I would also need an AAC or PCM elementary stream from the microphone data (I presume PCM would be easier since no encoding is involved), which I don't know how to do either.

    Getting the microphone data as AAC is very straightforward. Something like this:

    NSError *error = nil;
    AVCaptureDevice *microphone = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
    _audioInput = [AVCaptureDeviceInput deviceInputWithDevice:microphone error:&error];
    
    if (_audioInput == nil) {
        NSLog(@"Couldn't open microphone %@: %@", microphone, error);
        return NO;
    }
    
    _audioProcessingQueue = dispatch_queue_create("audio processing queue", DISPATCH_QUEUE_SERIAL);
    
    _audioOutput = [[AVCaptureAudioDataOutput alloc] init];
    [_audioOutput setSampleBufferDelegate:self queue:_audioProcessingQueue];
    
    
    NSDictionary *audioOutputSettings = @{
        AVFormatIDKey: @(kAudioFormatMPEG4AAC),
        AVNumberOfChannelsKey: @(1),
        AVSampleRateKey: @(44100.),
        AVEncoderBitRateKey: @(64000),
    };
    
    // _writer is an AVAssetWriter created elsewhere; because the settings above
    // ask for AAC, this input encodes the PCM buffers that get appended to it.
    _audioWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio outputSettings:audioOutputSettings];
    _audioWriterInput.expectsMediaDataInRealTime = YES;
    if(![_writer canAddInput:_audioWriterInput]) {
        NSLog(@"Couldn't add audio input to writer");
        return NO;
    }
    [_writer addInput:_audioWriterInput];
    
    [_captureSession addInput:_audioInput];
    [_captureSession addOutput:_audioOutput];
    
    - (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
    {
        // On iOS the audio data output delivers LPCM buffers; appending them to
        // _audioWriterInput is what produces the AAC, per the settings above.
        if (_audioWriterInput.readyForMoreMediaData) {
            [_audioWriterInput appendSampleBuffer:sampleBuffer];
        }
    }
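
    The snippet above also assumes that _writer (an AVAssetWriter) has been created, had the input added, and been started before the first buffer arrives. A rough sketch of that glue; outputURL and the _didStartAudioSession flag are placeholders of mine, not part of the original code:

    // During setup, before calling -startRunning on the capture session.
    NSError *writerError = nil;
    _writer = [[AVAssetWriter alloc] initWithURL:outputURL
                                        fileType:AVFileTypeMPEG4
                                           error:&writerError];
    // ... add _audioWriterInput as shown above, then:
    [_writer startWriting];

    // In the first delegate callback, anchor the writer's timeline to the
    // capture clock before appending anything:
    if (!_didStartAudioSession && _writer.status == AVAssetWriterStatusWriting) {
        [_writer startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
        _didStartAudioSession = YES;
    }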
    

    I'm guessing you're using an AVCaptureSession for your camera already; you can use the same capture session for the microphone.
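
    For completeness, a minimal sketch of what sharing that capture session with the camera tends to look like: a video data output on the same AVCaptureSession whose pixel buffers get pushed into the VTCompressionSession, with encoded frames coming back through its output callback. The dimensions, the callback name, and the _videoOutput, _videoProcessingQueue and _compressionSession ivars are illustrative assumptions, not code from this answer:

    #import <AVFoundation/AVFoundation.h>
    #import <VideoToolbox/VideoToolbox.h>

    // VideoToolbox calls this with each encoded frame; these H.264 sample
    // buffers are what get converted and muxed into the TS.
    static void EncodedFrameCallback(void *refCon, void *frameRefCon,
                                     OSStatus status, VTEncodeInfoFlags infoFlags,
                                     CMSampleBufferRef sampleBuffer)
    {
        if (status != noErr || sampleBuffer == NULL) return;
        // Hand sampleBuffer to the muxing code (e.g. the Annex B conversion above).
    }

    // Setup: add the camera to the same capture session used for the microphone.
    NSError *error = nil;
    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    AVCaptureDeviceInput *cameraInput = [AVCaptureDeviceInput deviceInputWithDevice:camera error:&error];
    [_captureSession addInput:cameraInput];

    _videoOutput = [[AVCaptureVideoDataOutput alloc] init];
    _videoProcessingQueue = dispatch_queue_create("video processing queue", DISPATCH_QUEUE_SERIAL);
    [_videoOutput setSampleBufferDelegate:self queue:_videoProcessingQueue];
    [_captureSession addOutput:_videoOutput];

    VTCompressionSessionCreate(kCFAllocatorDefault, 1280, 720, kCMVideoCodecType_H264,
                               NULL, NULL, NULL, EncodedFrameCallback,
                               (__bridge void *)self, &_compressionSession);

    // In the shared captureOutput:didOutputSampleBuffer:fromConnection: delegate,
    // branch on which output delivered the buffer and feed video to the encoder:
    if (captureOutput == _videoOutput) {
        CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        VTCompressionSessionEncodeFrame(_compressionSession, pixelBuffer,
                                        CMSampleBufferGetPresentationTimeStamp(sampleBuffer),
                                        kCMTimeInvalid, NULL, NULL, NULL);
    }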