ios video-streaming mpeg-4

CMVideoFormatDescription extensions for MPEG4 streams


I've managed to decode and play H264 videos, but I'm having a hard time with MPEG4 videos.

What CMVideoFormatDescription extensions does an MPEG4 stream need? I'm getting error -8971 (codecExtensionNotFoundErr) when I try to create a VTDecompressionSession.

This is how I create the CMVideoFormatDescription:

OSStatus success = CMVideoFormatDescriptionCreate(kCFAllocatorDefault,
                                                  self.mediaCodec,
                                                  message.frameSize.width,
                                                  message.frameSize.height,
                                                  NULL,
                                                  &mediaDescriptor);

I assume I need to pass a CFDictionaryRef instead of that NULL, but I don't know what it should contain. Any ideas?


Solution

  • After much pain and agony, I've finally managed to make it work.

    The extensions dictionary (a CFDictionaryRef) needs at least a value for the kCMFormatDescriptionExtension_SampleDescriptionExtensionAtoms key, and that value must itself be a CFDictionaryRef. For H264, CMVideoFormatDescriptionCreateFromH264ParameterSets builds this dictionary internally, and it looks like this:

    avcC = <014d401e ffe10016 674d401e 9a660a0f ff350101 01400000 fa000013 88010100 0468ee3c 80>
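
    As a reading aid for the dump above, here is a minimal C sketch that pulls the fixed header fields out of an avcC payload. The struct and function names are made up for illustration, and the field layout is the AVCDecoderConfigurationRecord as it is commonly documented. It only inspects the first eight bytes and ignores the parameter sets themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed header fields of an avcC payload (hypothetical helper type). */
typedef struct {
    uint8_t  version;          /* configurationVersion */
    uint8_t  profile;          /* AVCProfileIndication */
    uint8_t  compatibility;    /* profile_compatibility */
    uint8_t  level;            /* AVCLevelIndication */
    uint8_t  nal_length_size;  /* bytes used by each NAL unit length prefix */
    uint8_t  sps_count;        /* number of sequence parameter sets */
    uint16_t first_sps_length; /* length of the first SPS, 0 if there is none */
} AvcCHeader;

/* Returns 1 on success, 0 if the buffer is too short to hold the header. */
int parse_avcc_header(const uint8_t *buf, size_t len, AvcCHeader *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 6) {
        return 0;
    }
    out->version          = buf[0];
    out->profile          = buf[1];
    out->compatibility    = buf[2];
    out->level            = buf[3];
    /* Low two bits hold lengthSizeMinusOne; the upper bits are reserved. */
    out->nal_length_size  = (uint8_t)((buf[4] & 0x03) + 1);
    /* Low five bits hold the SPS count; the upper bits are reserved. */
    out->sps_count        = (uint8_t)(buf[5] & 0x1F);
    out->first_sps_length = 0;
    if (out->sps_count > 0) {
        if (len < 8) {
            return 0;
        }
        /* Big-endian 16-bit length of the first SPS. */
        out->first_sps_length = (uint16_t)((buf[6] << 8) | buf[7]);
    }
    return 1;
}
```

    For the bytes shown above this reports version 1, profile 0x4d, level 0x1e, 4-byte NAL length prefixes, and one SPS of 22 bytes.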
    

    For MPEG4, however, you have to build it yourself. The end result should look like this:

    esds = <00000000 038081e6 00000003 8081e611 00000000 00000000 058081e5 060102>
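
    To help read the dump above: descriptor lengths in an esds payload are stored 7 bits per byte, most significant group first, with the top bit set on every byte except the last. Below is a minimal C sketch of a decoder for that encoding. The function name is hypothetical, and the 4-byte cap is an assumption that matches how the lengths are written in this answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode an MPEG-4 descriptor length from buf.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or the length does not terminate within 4 bytes.
 */
size_t read_descr_length(const uint8_t *buf, size_t len, uint32_t *value)
{
    assert(buf != NULL && value != NULL);
    uint32_t v = 0;
    for (size_t i = 0; i < len && i < 4; i++) {
        /* Append the low 7 bits of this byte to the running value. */
        v = (v << 7) | (uint32_t)(buf[i] & 0x7F);
        /* A clear top bit marks the final byte of the length. */
        if ((buf[i] & 0x80) == 0) {
            *value = v;
            return i + 1;
        }
    }
    return 0;
}
```

    Applied to the four bytes at offsets 5 to 8 of the dump above (80 81 e6 00), this yields 29440 and consumes all four bytes.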
    

    How exactly this should be built is still fuzzy to me, but the code below somehow works. I was inspired by this link. This is the code:

    - (CMFormatDescriptionRef)createFormatDescriptorFromMPEG4Message:(MessageContainer *)message {
        CMVideoFormatDescriptionRef mediaDescriptor = NULL;
        NSData *esdsData = [self newESDSFromData:message.frameData];
    
        // Map the atom name to its payload. NSDictionary is toll-free bridged
        // to CFDictionaryRef, so ARC manages both dictionaries.
        NSDictionary *atoms = @{@"esds" : esdsData};
    
        NSDictionary *dictionary = @{(__bridge NSString *)kCMFormatDescriptionExtension_SampleDescriptionExtensionAtoms : atoms};
    
        OSStatus status = CMVideoFormatDescriptionCreate(kCFAllocatorDefault,
                                                         self.mediaCodec,
                                                         message.frameSize.width,
                                                         message.frameSize.height,
                                                         (__bridge CFDictionaryRef)dictionary,
                                                         &mediaDescriptor);
        if (status != noErr) {
            NSLog(@"CMVideoFormatDescriptionCreate failed with %d", (int)status);
        }
    
        return mediaDescriptor;
    }
    
    
    - (NSData *)newESDSFromData:(NSData *)data {
        NSInteger dataLength = data.length;
    
        int full_size = 3 + 5 + 13 + 5 + dataLength + 3;
    
        // ES_DescrTag data + DecoderConfigDescrTag + data + DecSpecificInfoTag + size + SLConfigDescriptor
        int config_size = 13 + 5 + dataLength;
        int padding = 12;
    
        int8_t *esdsInfo = calloc(full_size + padding, sizeof(int8_t));
    
        //Version
        esdsInfo[0] = 0;
    
        //Flags
        esdsInfo[1] = 0;
        esdsInfo[2] = 0;
        esdsInfo[3] = 0;
    
        //ES_DescrTag
        esdsInfo[4] |= 0x03;
        [self addMPEG4DescriptionLength:full_size
                              toPointer:esdsInfo + 5];
    
        //esid
        esdsInfo[8] = 0;
        esdsInfo[9] = 0;
    
        //Stream priority
        esdsInfo[10] = 0;
    
        //DecoderConfigDescrTag
        esdsInfo[11] = 0x03;
    
        [self addMPEG4DescriptionLength:config_size
                              toPointer:esdsInfo + 12];
    
        //Stream Type
        esdsInfo[15] = 0x11;
    
        //Buffer Size
        esdsInfo[16] = 0;
        esdsInfo[17] = 0;
    
        //Max bitrate
        esdsInfo[18] = 0;
        esdsInfo[19] = 0;
        esdsInfo[20] = 0;
    
        //Avg bitrate
        esdsInfo[21] = 0;
        esdsInfo[22] = 0;
        esdsInfo[23] = 0;
    
        //DecSpecificInfoTag
        esdsInfo[24] |= 0x05;
    
        [self addMPEG4DescriptionLength:dataLength
                              toPointer:esdsInfo + 25];
    
        //SLConfigDescrTag
        esdsInfo[28] = 0x06;
    
        //Length
        esdsInfo[29] = 0x01;
    
        esdsInfo[30] = 0x02;
    
        NSData *esdsData = [NSData dataWithBytes:esdsInfo length:31 * sizeof(int8_t)];
    
        free(esdsInfo);
        return esdsData;
    }
    
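    // Writes `length` as four bytes of 7 bits each, most significant group first,
    // with the continuation bit (0x80) set on all but the last byte.
    // Note: the callers above leave only three bytes per length, so the fourth
    // byte written here is overwritten by the field that follows it.
    - (void)addMPEG4DescriptionLength:(NSInteger)length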
                            toPointer:(int8_t *)ptr {
        for (int i = 3; i >= 0; i--) {
            uint8_t b = (length >> (i * 7)) & 0x7F;
            if (i != 0) {
                b |= 0x80;
            }
    
            ptr[3 - i] = b;
        }
    }
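
    For comparison, here is a minimal plain C sketch of the same layout with every field given its own bytes. It is an assumption-laden illustration, not a tested drop-in replacement: the function names are hypothetical, the layout follows the ES_Descriptor structure from MPEG-4 Systems as commonly documented, and it has not been verified against VideoToolbox. It differs from the method above in three ways: it uses 0x04 for DecoderConfigDescrTag, it adds the object type byte (0x20, MPEG-4 Visual), and it copies the decoder configuration bytes after DecSpecificInfoTag instead of only writing their length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Write `length` as a fixed 4-byte descriptor length; returns bytes written. */
size_t write_descr_length(uint8_t *p, uint32_t length)
{
    for (int i = 3; i >= 0; i--) {
        uint8_t b = (uint8_t)((length >> (i * 7)) & 0x7F);
        if (i != 0) {
            b |= 0x80;                    /* continuation bit on all but the last byte */
        }
        *p++ = b;
    }
    return 4;
}

/*
 * Build an esds payload around `config` (the decoder-specific bytes).
 * Returns a malloc'd buffer the caller must free, or NULL on failure.
 */
uint8_t *build_esds(const uint8_t *config, size_t config_len, size_t *out_len)
{
    assert(out_len != NULL);
    assert(config != NULL || config_len == 0);
    if (config_len > 0x0FFFFF00u) {
        return NULL;                      /* a 4-byte length carries only 28 bits */
    }

    size_t dec_specific       = 1 + 4 + config_len;   /* tag + length + payload */
    size_t dec_config_payload = 13 + dec_specific;    /* 13 fixed bytes, then DecSpecificInfo */
    size_t es_payload         = 3 + (1 + 4 + dec_config_payload) + 3;
    size_t total              = 4 + 1 + 4 + es_payload;

    uint8_t *buf = calloc(total, 1);
    if (buf == NULL) {
        return NULL;
    }
    uint8_t *p = buf + 4;                 /* version (1) and flags (3) stay 0 */

    *p++ = 0x03;                          /* ES_DescrTag */
    p += write_descr_length(p, (uint32_t)es_payload);
    p += 2;                               /* ES_ID = 0 */
    p += 1;                               /* flags and stream priority = 0 */

    *p++ = 0x04;                          /* DecoderConfigDescrTag */
    p += write_descr_length(p, (uint32_t)dec_config_payload);
    *p++ = 0x20;                          /* objectTypeIndication: MPEG-4 Visual */
    *p++ = 0x11;                          /* streamType: visual, reserved bit set */
    p += 3;                               /* buffer size = 0 */
    p += 4;                               /* max bitrate = 0 */
    p += 4;                               /* avg bitrate = 0 */

    *p++ = 0x05;                          /* DecSpecificInfoTag */
    p += write_descr_length(p, (uint32_t)config_len);
    if (config_len > 0) {
        memcpy(p, config, config_len);
    }
    p += config_len;

    *p++ = 0x06;                          /* SLConfigDescrTag */
    *p++ = 0x01;                          /* length */
    *p++ = 0x02;                          /* predefined */

    assert((size_t)(p - buf) == total);
    *out_len = total;
    return buf;
}
```

    The returned buffer can be handed to +[NSData dataWithBytesNoCopy:length:freeWhenDone:] with freeWhenDone set to YES and then used as the "esds" value in the dictionary above.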
    

    The message container is a simple wrapper around the data received from the server:

    @interface MessageContainer : NSObject
    
    @property (nonatomic) CGSize frameSize;
    @property (nonatomic) NSData *frameData;
    
    @end
    

    Here frameSize is the size of the frame (received separately from the server) and frameData is the frame data itself.
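
    If frameData is a raw MPEG-4 Part 2 elementary stream, the headers that precede the first VOP start code (00 00 01 B6) are what is usually treated as the decoder configuration. Whether the server stream here is laid out that way is an assumption, and the function name is hypothetical. A minimal C sketch for locating that boundary:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the offset of the first VOP start code (00 00 01 B6) in data,
 * or len if no such start code is present.
 * Bytes [0, offset) are then the candidate configuration headers.
 */
size_t find_first_vop(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i + 4 <= len; i++) {
        if (data[i] == 0x00 && data[i + 1] == 0x00 &&
            data[i + 2] == 0x01 && data[i + 3] == 0xB6) {
            return i;
        }
    }
    return len;
}
```

    A return value equal to len means the buffer holds no VOP start code, so no split point was found.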