I've managed to decode and play H264 videos, but I'm having a difficult time with MPEG4 videos.
Which CMVideoFormatDescription extensions does it need? I get error -8971 (codecExtensionNotFoundErr) when trying to create a VTDecompressionSession.
This is how I create the video format description:
OSStatus success = CMVideoFormatDescriptionCreate(kCFAllocatorDefault,
                                                  self.mediaCodec,
                                                  message.frameSize.width,
                                                  message.frameSize.height,
                                                  NULL,
                                                  &mediaDescriptor);
Instead of that NULL, I assume I need to specify a CFDictionaryRef, however I don't know what it should contain. Any idea?
After much pain and agony, I've finally managed to make it work.
I need to provide a CFDictionaryRef with at least a value for the kCMFormatDescriptionExtension_SampleDescriptionExtensionAtoms key. The value for this key also has to be a CFDictionaryRef, mapping an atom name to its data. For H264 types this is created for you inside CMVideoFormatDescriptionCreateFromH264ParameterSets and looks like this:
avcC = <014d401e ffe10016 674d401e 9a660a0f ff350101 01400000 fa000013 88010100 0468ee3c 80>
However for the MPEG4 type, you need to create this on your own. The end result should look like this:
esds = <00000000 038081e6 00000003 8081e611 00000000 00000000 058081e5 060102>
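To read a dump like this, it helps to know that the size after each descriptor tag is stored 7 bits per byte, most significant group first, with the high bit set on every byte except the last. For example, `06 01 02` at the end is tag 0x06, a one-byte size of 1, and a one-byte payload. Below is a minimal C sketch of a reader for that size field. The name `read_descr_length` is hypothetical, and the 4-byte cap is an assumption that matches the widest size written in this answer.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an MPEG-4 descriptor size field starting at p.
   Returns the number of bytes consumed (1..4), or 0 if the field is
   truncated or still has its continuation bit set after 4 bytes. */
static size_t read_descr_length(const uint8_t *p, size_t avail, uint32_t *out) {
    uint32_t value = 0;
    for (size_t i = 0; i < avail && i < 4; i++) {
        value = (value << 7) | (uint32_t)(p[i] & 0x7F); /* append 7 payload bits */
        if ((p[i] & 0x80) == 0) {                        /* high bit clear: last byte */
            *out = value;
            return i + 1;
        }
    }
    return 0;
}
```

Calling it on the bytes that follow a tag tells you how many bytes of payload the descriptor claims to have, which is a quick way to sanity-check a hand-built atom.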
The way to create this is still fuzzy to me, but it works. I was inspired by this link. This is the code:
- (CMFormatDescriptionRef)createFormatDescriptorFromMPEG4Message:(MessageContainer *)message {
    CMVideoFormatDescriptionRef mediaDescriptor = NULL;
    NSData *esdsData = [self newESDSFromData:message.frameData];
    if (esdsData == nil) {
        return NULL;
    }

    // The extension atoms dictionary maps an atom name to its payload.
    // Using NSDictionary literals avoids leaking a manually created CF dictionary.
    NSDictionary *atoms = @{@"esds" : esdsData};
    NSDictionary *extensions = @{(__bridge NSString *)kCMFormatDescriptionExtension_SampleDescriptionExtensionAtoms : atoms};

    OSStatus status = CMVideoFormatDescriptionCreate(kCFAllocatorDefault,
                                                     self.mediaCodec,
                                                     (int32_t)message.frameSize.width,
                                                     (int32_t)message.frameSize.height,
                                                     (__bridge CFDictionaryRef)extensions,
                                                     &mediaDescriptor);
    if (status != noErr) {
        // OSStatus is a 32-bit integer, so print it with %d.
        NSLog(@"CMVideoFormatDescriptionCreate failed with %d", (int)status);
    }
    // The caller owns the returned reference and must CFRelease it.
    return mediaDescriptor;
}
- (NSData *)newESDSFromData:(NSData *)data {
    NSInteger dataLength = data.length;
    // ES_Descriptor fields (3) + DecoderConfigDescriptor header (5) + its fields (13)
    // + DecoderSpecificInfo header (5) + data + SLConfigDescriptor (3)
    int full_size = 3 + 5 + 13 + 5 + (int)dataLength + 3;
    // DecoderConfigDescriptor fields (13) + DecoderSpecificInfo header (5) + data
    int config_size = 13 + 5 + (int)dataLength;
    int padding = 12;
    int8_t *esdsInfo = calloc(full_size + padding, sizeof(int8_t));
    if (esdsInfo == NULL) {
        return nil;
    }

    // Version
    esdsInfo[0] = 0;
    // Flags
    esdsInfo[1] = 0;
    esdsInfo[2] = 0;
    esdsInfo[3] = 0;

    // ES_DescrTag
    esdsInfo[4] = 0x03;
    // Writes 4 bytes at offsets 5..8. Note that the next field starts at
    // offset 8, so the last length byte is overwritten. The same happens
    // for the two length fields further down.
    [self addMPEG4DescriptionLength:full_size
                          toPointer:esdsInfo + 5];
    // ES_ID
    esdsInfo[8] = 0;
    esdsInfo[9] = 0;
    // Stream priority
    esdsInfo[10] = 0;

    // DecoderConfigDescrTag (ISO/IEC 14496-1 defines it as 0x04;
    // 0x03 is the value that produced the dump above)
    esdsInfo[11] = 0x03;
    [self addMPEG4DescriptionLength:config_size
                          toPointer:esdsInfo + 12];
    // Stream type (0x11 = visual stream)
    esdsInfo[15] = 0x11;
    // Buffer size
    esdsInfo[16] = 0;
    esdsInfo[17] = 0;
    // Max bitrate
    esdsInfo[18] = 0;
    esdsInfo[19] = 0;
    esdsInfo[20] = 0;
    // Avg bitrate
    esdsInfo[21] = 0;
    esdsInfo[22] = 0;
    esdsInfo[23] = 0;

    // DecSpecificInfoTag
    esdsInfo[24] = 0x05;
    [self addMPEG4DescriptionLength:dataLength
                          toPointer:esdsInfo + 25];

    // SLConfigDescrTag
    esdsInfo[28] = 0x06;
    // Length
    esdsInfo[29] = 0x01;
    esdsInfo[30] = 0x02;

    // Only these 31 header bytes are handed over; the frame data itself
    // is not copied into the atom.
    NSData *esdsData = [NSData dataWithBytes:esdsInfo length:31 * sizeof(int8_t)];
    free(esdsInfo);
    return esdsData;
}

// Writes length as 4 bytes, 7 bits per byte, most significant group first.
// Every byte except the last has its high bit set.
- (void)addMPEG4DescriptionLength:(NSInteger)length
                        toPointer:(int8_t *)ptr {
    for (int i = 3; i >= 0; i--) {
        uint8_t b = (length >> (i * 7)) & 0x7F;
        if (i != 0) {
            b |= 0x80;
        }
        ptr[3 - i] = b;
    }
}
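For comparison, here is a minimal sketch in plain C (which also compiles inside an Objective-C file) of the ES_Descriptor layout as described in ISO/IEC 14496-1, with every field written sequentially and the decoder-specific bytes copied into the atom. The names `put_descr_length` and `build_esds` are hypothetical. The 0x04 tag and the 0x20 object type (MPEG-4 Visual) come from the spec rather than from the dump above, and `config` is assumed to hold the stream's configuration header. This variant has not been checked against VideoToolbox here, so treat it as a starting point.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Write length as a fixed 4-byte size field: 7 bits per byte, most
   significant group first, high bit set on all but the last byte. */
static uint8_t *put_descr_length(uint8_t *p, uint32_t length) {
    for (int i = 3; i >= 0; i--) {
        uint8_t b = (uint8_t)((length >> (i * 7)) & 0x7F);
        if (i != 0) b |= 0x80;
        *p++ = b;
    }
    return p;
}

/* Build an 'esds' payload (version/flags followed by an ES_Descriptor).
   Returns a malloc'd buffer the caller must free, or NULL on failure. */
static uint8_t *build_esds(const uint8_t *config, uint32_t config_size,
                           size_t *out_size) {
    if (config_size > 0x0FFFFF00u) return NULL;   /* sizes must fit in 28 bits */

    uint32_t dec_config_size = 13 + 5 + config_size;   /* fields + DSI header + DSI */
    uint32_t es_size = 3 + 5 + dec_config_size + 3;    /* ES fields + DCD + SLConfig */
    size_t total = 4 + 1 + 4 + (size_t)es_size;

    uint8_t *buf = calloc(total, 1);                   /* zero-filled */
    if (buf == NULL) return NULL;
    uint8_t *p = buf;

    p += 4;                         /* version (1 byte) + flags (3 bytes), zero */

    *p++ = 0x03;                    /* ES_DescrTag */
    p = put_descr_length(p, es_size);
    p += 2;                         /* ES_ID = 0 */
    p += 1;                         /* flags / stream priority = 0 */

    *p++ = 0x04;                    /* DecoderConfigDescrTag */
    p = put_descr_length(p, dec_config_size);
    *p++ = 0x20;                    /* objectTypeIndication: MPEG-4 Visual */
    *p++ = 0x11;                    /* streamType: visual (0x04 << 2), reserved bit set */
    p += 3;                         /* bufferSizeDB = 0 */
    p += 4;                         /* maxBitrate = 0 */
    p += 4;                         /* avgBitrate = 0 */

    *p++ = 0x05;                    /* DecSpecificInfoTag */
    p = put_descr_length(p, config_size);
    if (config_size > 0) memcpy(p, config, config_size);
    p += config_size;

    *p++ = 0x06;                    /* SLConfigDescrTag */
    *p++ = 0x01;                    /* size of the SLConfig payload */
    *p++ = 0x02;                    /* predefined = 2 */

    *out_size = (size_t)(p - buf);
    return buf;
}
```

From Objective-C, the result could be wrapped with `[NSData dataWithBytesNoCopy:length:freeWhenDone:]` and stored under the "esds" key in place of the 31-byte buffer.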
The message container is a simple wrapper around the data received from the server:
@interface MessageContainer : NSObject
@property (nonatomic) CGSize frameSize;
@property (nonatomic) NSData *frameData;
@end
Here frameSize is the size of the frame (received separately from the server) and frameData is the data itself.