Tags: mp4, h.264, encoder, decoder, fmp4

Detecting an I-frame in an H.264 stream (fragmented MP4)


I need to check whether the first frame in an H.264 stream is an I-frame.

The input is a fragmented MP4 file. I tried to determine the frame type from the "sample depends on" flag in moof->traf->trun, but it seems that not every file has this flag filled in. So I want to try to determine the frame type from the raw data in the mdat box instead.

I only need to check that the first frame of each fragment is an I-frame. Information about the other frames doesn't matter.

How can I do it?


Solution

  • You can check the NAL unit type. NAL unit type 5 indicates an IDR frame, which is an I-frame. Inside the 'mdat' box the video is stored as:

    <size><NAL><size><NAL>...<size><NAL>
    

    The lower 5 bits of the first byte of each NAL unit indicate the type. Skip over types 6, 7, 8 and 9 (SEI, SPS, PPS and access unit delimiter) until you find type 1 (non-IDR frame) or type 5 (IDR frame).
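
The walk described above can be sketched in Python as follows. This is a minimal illustration, not a full parser: the function names are hypothetical, `sample` is assumed to hold only the length-prefixed video bytes of one frame (or of the video part of the mdat), and the length prefix is assumed to be 4 bytes unless you pass another size. For simplicity it skips every NAL unit that is not type 1 or 5, not just types 6 to 9.

```python
from typing import Optional


def first_slice_nal_type(sample: bytes, length_size: int = 4) -> Optional[int]:
    """Return the type (1 or 5) of the first slice NAL unit, or None.

    `sample` is assumed to be <size><NAL><size><NAL>... data taken from mdat.
    """
    pos = 0
    while pos + length_size <= len(sample):
        # Big-endian length prefix in front of every NAL unit.
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if nal_len == 0 or pos + nal_len > len(sample):
            return None  # truncated or corrupt data
        nal_type = sample[pos] & 0x1F  # lower 5 bits of the first NAL byte
        if nal_type in (1, 5):
            return nal_type
        pos += nal_len  # SEI, SPS, PPS, AUD or anything else: skip it
    return None


def starts_with_idr(sample: bytes, length_size: int = 4) -> bool:
    """True if the first slice NAL unit found is an IDR slice (type 5)."""
    return first_slice_nal_type(sample, length_size) == 5
```

Calling `starts_with_idr(video_bytes)` on the first sample of each fragment gives the yes/no answer asked for in the question.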

    MP4 files should not contain start codes ([00] 00 00 01), and they normally do not carry access unit delimiters either.

    MPEG-2 Transport Streams and raw *.h264 files contain start codes ([00] 00 00 01) and access unit delimiters.

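    The size field in MP4 is 4 bytes most of the time, but to get the correct value you have to parse the codec private data (the avcC box, which also carries the SPS/PPS): its lengthSizeMinusOne field gives the size of the length prefix.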
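
As a rough sketch of how the size of the length prefix can be read: in MP4 the codec private data is the avcC box (the AVCDecoderConfigurationRecord), which for fragmented files sits in the initialization segment under moov. Its fifth byte holds six reserved bits followed by the 2-bit lengthSizeMinusOne field. The function name below is hypothetical, and `avcc` is assumed to be the box payload only, with the 8-byte box header (size and 'avcC' type) already stripped.

```python
def nal_length_size(avcc: bytes) -> int:
    """Return the size in bytes of the NAL length prefix (usually 4).

    `avcc` is assumed to be the avcC box payload without its box header:
    configurationVersion, profile, compatibility, level, then the byte
    that carries lengthSizeMinusOne in its two lowest bits.
    """
    if len(avcc) < 5:
        raise ValueError("avcC payload too short")
    return (avcc[4] & 0x03) + 1
```

The returned value can then be used as the length-prefix size when walking the NAL units in mdat.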

    In short, H.264 comes in two formats:

    Annex-B (MPEG-2 TS, or *.264 raw file):

    <[00] 00 00 01> <NAL> <[00] 00 00 01> <NAL> ... <[00] 00 00 01> <NAL>
    

    MP4 (mdat):

    <size><NAL><size><NAL>...<size><NAL> 
    

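To make the difference between the two layouts concrete, here is a small sketch that rewrites length-prefixed MP4 data as Annex-B by replacing every size field with a 4-byte start code. The function name is hypothetical and the 4-byte default for the length prefix is an assumption; a tool that only understands Annex-B could then be pointed at the result.

```python
def avcc_to_annexb(sample: bytes, length_size: int = 4) -> bytes:
    """Convert <size><NAL>... data into <00 00 00 01><NAL>... data."""
    out = bytearray()
    pos = 0
    while pos + length_size <= len(sample):
        # Read the big-endian size field that precedes the NAL unit.
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_len > len(sample):
            raise ValueError("NAL unit runs past the end of the sample")
        # Emit a start code in place of the size field, then the NAL bytes.
        out += b"\x00\x00\x00\x01" + sample[pos:pos + nal_len]
        pos += nal_len
    if pos != len(sample):
        raise ValueError("trailing bytes after the last NAL unit")
    return bytes(out)
```
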
    Your file at https://drive.google.com/file/d/1Vwcz8WsTuRLJie8SFzGspizyTc-caGjc/view?usp=sharing has video and audio in the same mdat.

    So to make the I-frame detection reliable, you have to parse a little more.

    This gives you the offset at which the video starts in the mdat:

    moof[i]->traf[0]->trun[0]->dataOffset
    

    Audio starts here, so stop parsing video at this offset:

    moof[i]->traf[1]->trun[0]->dataOffset
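
Putting the two offsets together, a sketch of the bounded check could look like this. Several things here are assumptions rather than facts about your file: the function and parameter names are hypothetical, the two dataOffset values are taken as already parsed from the trun boxes, they are treated as relative to the first byte of the enclosing moof box (the common case when the tfhd carries no explicit base data offset), and the video data is assumed to come before the audio data in the mdat.

```python
from typing import Optional


def first_video_slice_type(data: bytes, moof_start: int,
                           video_data_offset: int, audio_data_offset: int,
                           length_size: int = 4) -> Optional[int]:
    """Return 1 or 5 for the first slice NAL unit in the video region.

    Assumes both offsets are relative to `moof_start` and that the video
    bytes end where the audio bytes begin.
    """
    pos = moof_start + video_data_offset
    end = min(moof_start + audio_data_offset, len(data))
    while pos + length_size <= end:
        nal_len = int.from_bytes(data[pos:pos + length_size], "big")
        pos += length_size
        if nal_len == 0 or pos + nal_len > end:
            return None  # would run into the audio data: stop parsing
        nal_type = data[pos] & 0x1F  # lower 5 bits of the first NAL byte
        if nal_type in (1, 5):
            return nal_type
        pos += nal_len  # skip SEI, SPS, PPS, AUD, ...
    return None
```

A fragment starts with an IDR frame when this returns 5. Stopping at the audio offset is what keeps audio bytes from being misread as NAL units.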