winapi audio mp4 ms-media-foundation

Corrupted Media Type when saving MP4 and then reading it with Media Foundation


As described in this question, I'm trying to create a custom audio codec and save its output with the Sink Writer in an MP4 file. I succeeded by setting MF_MT_MPEG4_SAMPLE_DESCRIPTION, following some information from this link.

The sample descriptor I set is this one:

UINT8 cz[72] = {
0x0,0x0,0x0,0x48, // len
0x73,0x74,0x73,0x64, // stsd
0x0,0x0,0x0,0x0, // verflag
0x0,0x0,0x0,0x1, // num
0x0,0x0,0x0,0x38, // len
'e','c','d','c', // code
0x0,0x0,0x0,0x0,0x0,0x0, // 6-null
// index
0x0,0x1, 
// version
0x0,0x1, 
// Revision Level
0x0,0x0,
// vendor
0x0,0x0,0x0,0x0,
// number channels
0x0,0x2,
// sample size
0x0,0x10,
// Compression ID
0xFF,0xFE,
// Packet Size,
0x0,0x0,
// Sample rate
0xBB,0x80,0x0,0x0,
// Sound description version 1 fields (4 x 32-bit)
0x0,0x0,0x0,0x0,
0x0,0x0,0x0,0x0,
0x0,0x0,0x0,0x0,
0x0,0x0,0x0,0x0,
};
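
To double-check a hand-built descriptor like this before handing it to the Sink Writer, the annotated fields can be read back with a small parser. This is only a minimal sketch: `AudioSampleEntryInfo`, `ReadBE16`, `ReadBE32`, and `ParseStsdAudioEntry` are hypothetical helper names, not Media Foundation APIs, and the offsets assume the classic 36-byte audio sample entry header used above (big-endian fields, 16.16 fixed-point sample rate).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper type: fields of the first audio sample entry in an 'stsd' box.
struct AudioSampleEntryInfo {
    std::string code;        // four-character sample entry code, e.g. "ecdc"
    uint32_t entrySize;      // size field of the sample entry
    uint16_t channels;
    uint16_t bitsPerSample;
    uint32_t sampleRate;     // integer part of the 16.16 fixed-point rate
};

static uint16_t ReadBE16(const uint8_t* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

static uint32_t ReadBE32(const uint8_t* p) {
    return (static_cast<uint32_t>(p[0]) << 24) | (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8)  |  static_cast<uint32_t>(p[3]);
}

// Returns false if the blob is too short or its size fields disagree with its length.
bool ParseStsdAudioEntry(const uint8_t* data, size_t size, AudioSampleEntryInfo* out) {
    assert(data != nullptr && out != nullptr);
    const size_t kStsdHeader = 16;    // size + 'stsd' + version/flags + entry count
    const size_t kAudioEntry = 36;    // fixed part of the audio sample entry
    if (size < kStsdHeader + kAudioEntry) return false;
    if (ReadBE32(data) != size) return false;                        // outer box length
    if (data[4] != 's' || data[5] != 't' || data[6] != 's' || data[7] != 'd') return false;
    if (ReadBE32(data + 12) < 1) return false;                       // number of entries

    const uint8_t* entry = data + kStsdHeader;
    out->entrySize = ReadBE32(entry);
    if (out->entrySize < kAudioEntry || out->entrySize > size - kStsdHeader) return false;

    out->code.assign(reinterpret_cast<const char*>(entry + 4), 4);   // e.g. "ecdc"
    out->channels      = ReadBE16(entry + 24);
    out->bitsPerSample = ReadBE16(entry + 26);
    out->sampleRate    = ReadBE32(entry + 32) >> 16;                 // drop the fraction
    return true;
}
```

Calling `ParseStsdAudioEntry(cz, sizeof(cz), &info)` on the array above should report the code `ecdc`, 2 channels, 16 bits per sample, and 48000 Hz. Those are the same values that later show up in the media type attributes.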

My custom audio media type has the GUID {0000ECDC-0000-0010-8000-00AA00389B71}, set with MF_MT_SUBTYPE. Just before calling Finalize, I check the media type of the MP4 being written, and it is indeed valid:

MF_MT_MPEG4_SAMPLE_DESCRIPTION  byte array
MF_MT_AUDIO_NUM_CHANNELS    2
MF_MT_MAJOR_TYPE    MFMediaType_Audio
MF_MT_AUDIO_SAMPLES_PER_SECOND  48000
MF_MT_MPEG4_CURRENT_SAMPLE_ENTRY    0
MF_MT_SUBTYPE   {0000ECDC-0000-0010-8000-00AA00389B71}

Now the weird part. After reopening the file:

    CComPtr<IMFSourceReader> srr;
    MFCreateSourceReaderFromURL(fi.c_str(), 0, &srr);
    CComPtr<IMFMediaType> c;
    srr->GetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, &c);
    LogMediaType(c);

This time I get the following:

MF_MT_AUDIO_AVG_BYTES_PER_SECOND    383
MF_MT_AVG_BITRATE   3071
MF_MT_MPEG4_SAMPLE_DESCRIPTION  byte array
MF_MT_AUDIO_NUM_CHANNELS    2
MF_MT_MAJOR_TYPE    MFMediaType_Audio
MF_MT_AUDIO_SAMPLES_PER_SECOND  48000
MF_MT_MPEG4_CURRENT_SAMPLE_ENTRY    0
MF_MT_AUDIO_BITS_PER_SAMPLE 16
MF_MT_SUBTYPE   {65636463-767A-494D-B478-F29D25DC9037}

Now I'm forced to register my decoder with the weird {65636463-767A-494D-B478-F29D25DC9037} GUID as its subtype, and I also get some garbage such as the average bitrate.

What could cause this?

If I push the AAC descriptor, then the media type is correctly returned from the source reader. This is the AAC descriptor:

UINT8 caac[100] = {
0x0,0x0,0x0,0x64, // len
0x73,0x74,0x73,0x64, // stsd
0x0,0x0,0x0,0x0, // verflag
0x0,0x0,0x0,0x1, // num of descriptors
0x0,0x0,0x0,0x54, // len
0x6D,0x70,0x34,0x61, // 'mp4a' AAC 
0x0,0x0,0x0,0x0,0x0,0x0, //6-null
0x0,0x1, // index
0x0,0x0, // version
0x0,0x0, // revision
0x0,0x0,0x0,0x0, // vendor
0x0,0x2, // channels
0x0,0x10, // sample size
0x0,0x0, // compression ID
0x0,0x0, // packet size
// Sample rate
0xBB,0x80,0x0,0x0,
0x0,0x0,0x0,0x30, // 48 bytes
0x65,0x73,0x64,0x73, // 'esds'
0x0,0x0,0x0,0x0,0x3,0x80,0x80,0x80,0x1F,0x0,0x0,0x0,0x4,0x80,0x80,0x80,0x14,0x40,0x15,0x0,0x6,0x0,0x0,0x2,0xEE,0x0,0x0,0x2,0xEE,0x0,0x5,0x80,0x80,0x80,0x2,0x11,0x90,0x6,0x1,0x2
    };

So it includes an 'esds' descriptor... why? But even if I duplicate the AAC descriptor and change only the code to 'ecdc', the result is still a corrupt media type.


Solution

  • The confusion in this and your previous questions comes from the following.

    When you create the file, your custom audio subtype does not mean much, because you are supplying the full stsd box yourself and are responsible for its contents.

    When you read the file, you don't get your audio subtype back directly. Instead, the implementation sees the sample entry you placed inside stsd (ecdc in your case) and creates a media type as described here:

    For any other codes not shown in the previous table, the MPEG-4 file source sets the subtype as follows:

    subtype = MFMPEG4Format_Base        // {00000000-767A-494D-B478-F29D25DC9037}
    subtype.Data1 = sample entry code   // 'ecdc' = 0x65636463, which gives your weird GUID

    Given this, you should be able to register your decoder for the media type produced by the MPEG-4 media source so that it gets discovered. Alternatively, you can always just read the raw data from the source.
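
    The rule quoted above can be sketched in portable C++ to show where the GUID comes from. This is a rough illustration under some assumptions: `Guid` is a stand-in with the same field layout as the Windows `GUID` struct (so the sketch compiles without Windows headers), `kMpeg4FormatBase` repeats the value of `MFMPEG4Format_Base`, and `SubtypeFromSampleEntryCode` and `GuidToString` are hypothetical helper names. The four characters of the sample entry code are packed big-endian into `Data1`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for the Windows GUID struct (same field layout); an assumption for portability.
struct Guid {
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
};

// Value of MFMPEG4Format_Base: {00000000-767A-494D-B478-F29D25DC9037}
const Guid kMpeg4FormatBase = {
    0x00000000, 0x767A, 0x494D, {0xB4, 0x78, 0xF2, 0x9D, 0x25, 0xDC, 0x90, 0x37}
};

// Builds the subtype for a sample entry code that the file source does not recognize:
// start from the base GUID and put the four characters, big-endian, into Data1.
Guid SubtypeFromSampleEntryCode(const std::string& code) {
    assert(code.size() == 4);
    Guid g = kMpeg4FormatBase;
    g.Data1 = (static_cast<uint32_t>(static_cast<uint8_t>(code[0])) << 24) |
              (static_cast<uint32_t>(static_cast<uint8_t>(code[1])) << 16) |
              (static_cast<uint32_t>(static_cast<uint8_t>(code[2])) << 8)  |
               static_cast<uint32_t>(static_cast<uint8_t>(code[3]));
    return g;
}

// Formats the GUID in the usual registry style for comparison with logged media types.
std::string GuidToString(const Guid& g) {
    char buf[64];
    std::snprintf(buf, sizeof(buf),
                  "{%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
                  static_cast<unsigned>(g.Data1),
                  static_cast<unsigned>(g.Data2), static_cast<unsigned>(g.Data3),
                  static_cast<unsigned>(g.Data4[0]), static_cast<unsigned>(g.Data4[1]),
                  static_cast<unsigned>(g.Data4[2]), static_cast<unsigned>(g.Data4[3]),
                  static_cast<unsigned>(g.Data4[4]), static_cast<unsigned>(g.Data4[5]),
                  static_cast<unsigned>(g.Data4[6]), static_cast<unsigned>(g.Data4[7]));
    return buf;
}
```

    `GuidToString(SubtypeFromSampleEntryCode("ecdc"))` gives {65636463-767A-494D-B478-F29D25DC9037}, the subtype the source reader reported. In a real project you would use `GUID` and `MFMPEG4Format_Base` from the Media Foundation headers instead of these stand-ins, and advertise the resulting GUID as your decoder's input subtype.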