ffmpeg h.264 rtp

Is ffmpeg broken for H.264 RTP output?


I used Wireshark to capture the RTP stream sent with:

ffmpeg -f lavfi -i "testsrc=duration=5:size=cif:rate=25" -pix_fmt yuv420p -g 25 -bf 2 -an -c:v libx264 -f rtp rtp://127.0.0.1:1234 > play.sdp

ffmpeg -version
ffmpeg version git-2020-03-15-c467328 Copyright (c) 2000-2020 the FFmpeg developers

As the capture below shows, the RTP timestamps go forward and backward. I expected them to be the same for every packet in a frame and then only to move forward by 40 ms per frame (+3600 at a 90 kHz clock), as per the H.264/RTP spec.
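As a quick check of that arithmetic, here is a minimal sketch. It assumes the 90 kHz RTP clock used for H.264 video and the 25 fps set by rate=25 in the command above; the function name is only illustrative.

```python
CLOCK_HZ = 90_000  # RTP clock rate for H.264 video
FPS = 25           # from "rate=25" in the testsrc filter


def ticks_per_frame(clock_hz: int, fps: int) -> int:
    """RTP timestamp increment between two consecutive frames."""
    return clock_hz // fps


def frame_duration_ms(fps: int) -> float:
    """Duration of one frame in milliseconds."""
    return 1000 / fps


print(ticks_per_frame(CLOCK_HZ, FPS), frame_duration_ms(FPS))
```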

Also, according to that spec, only the last packet of a frame should have its marker bit set, but here almost all packets have it set.

Am I doing something wrong? Not understanding something? Or is ffmpeg support for writing H.264 RTP simply broken?

SSRC=0xA49C3DC9, Seq=3595, Time=3153114809
SSRC=0xA49C3DC9, Seq=3596, Time=3153114809
SSRC=0xA49C3DC9, Seq=3597, Time=3153114809
SSRC=0xA49C3DC9, Seq=3598, Time=3153114809, Mark
SSRC=0xA49C3DC9, Seq=3599, Time=3153125609, Mark
SSRC=0xA49C3DC9, Seq=3600, Time=3153118409, Mark
SSRC=0xA49C3DC9, Seq=3601, Time=3153122009, Mark
SSRC=0xA49C3DC9, Seq=3602, Time=3153136409, Mark
SSRC=0xA49C3DC9, Seq=3603, Time=3153129209, Mark
SSRC=0xA49C3DC9, Seq=3604, Time=3153132809, Mark
SSRC=0xA49C3DC9, Seq=3605, Time=3153147209, Mark
SSRC=0xA49C3DC9, Seq=3606, Time=3153140009, Mark
SSRC=0xA49C3DC9, Seq=3607, Time=3153143609, Mark
SSRC=0xA49C3DC9, Seq=3608, Time=3153158009, Mark
SSRC=0xA49C3DC9, Seq=3609, Time=3153150809, Mark
SSRC=0xA49C3DC9, Seq=3610, Time=3153154409, Mark
SSRC=0xA49C3DC9, Seq=3611, Time=3153168809, Mark
SSRC=0xA49C3DC9, Seq=3612, Time=3153161609, Mark
SSRC=0xA49C3DC9, Seq=3613, Time=3153165209, Mark
SSRC=0xA49C3DC9, Seq=3614, Time=3153179609, Mark
SSRC=0xA49C3DC9, Seq=3615, Time=3153172409, Mark
SSRC=0xA49C3DC9, Seq=3616, Time=3153176009, Mark
SSRC=0xA49C3DC9, Seq=3617, Time=3153190409, Mark
SSRC=0xA49C3DC9, Seq=3618, Time=3153183209, Mark
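The capture can be checked mechanically. The sketch below is only an illustration: it parses the first ten lines of the log above, groups consecutive packets that share a timestamp into one access unit, and reports (a) whether the marker is set only on the last packet of each group and (b) the gaps between the timestamps once they are sorted. The helper names are made up for this example, and timestamp wrap-around is ignored.

```python
import re
from itertools import groupby

# First ten lines of the capture, copied verbatim.
CAPTURE = """\
SSRC=0xA49C3DC9, Seq=3595, Time=3153114809
SSRC=0xA49C3DC9, Seq=3596, Time=3153114809
SSRC=0xA49C3DC9, Seq=3597, Time=3153114809
SSRC=0xA49C3DC9, Seq=3598, Time=3153114809, Mark
SSRC=0xA49C3DC9, Seq=3599, Time=3153125609, Mark
SSRC=0xA49C3DC9, Seq=3600, Time=3153118409, Mark
SSRC=0xA49C3DC9, Seq=3601, Time=3153122009, Mark
SSRC=0xA49C3DC9, Seq=3602, Time=3153136409, Mark
SSRC=0xA49C3DC9, Seq=3603, Time=3153129209, Mark
SSRC=0xA49C3DC9, Seq=3604, Time=3153132809, Mark
"""

PACKET_RE = re.compile(r"Seq=(\d+), Time=(\d+)(, Mark)?")


def parse(capture):
    """Return a list of (seq, timestamp, marker) tuples in arrival order."""
    return [
        (int(m.group(1)), int(m.group(2)), m.group(3) is not None)
        for m in PACKET_RE.finditer(capture)
    ]


def marker_only_on_last(packets):
    """True if, in every run of equal timestamps, only the last packet is marked."""
    for _, group in groupby(packets, key=lambda p: p[1]):
        marks = [p[2] for p in group]
        if not marks[-1] or any(marks[:-1]):
            return False
    return True


def presentation_deltas(packets):
    """Gaps between distinct timestamps after sorting them (presentation order)."""
    ts = sorted({p[1] for p in packets})
    return [b - a for a, b in zip(ts, ts[1:])]


packets = parse(CAPTURE)
print(marker_only_on_last(packets))
print(presentation_deltas(packets))
```

On this sample, every gap between sorted timestamps is 3600, and the marker sits only on the last packet of each timestamp group.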

The RTP specification, defined in RFC 3550, states that "the timestamp reflects the sampling instant of the first octet in the RTP data packet. The sampling instant must be derived from a clock that increments monotonically and linearly in time to allow synchronization and jitter calculations" (Section 5.1).


Solution

  • RFC 6184 says of the marker bit:

    Set for the very last packet of the access unit indicated by the RTP timestamp
    

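    Each access unit (AU) here is one frame. The first AU spans four packets, and only the last of them (Seq=3598) has the marker set. Every later frame fits in a single packet, so each of those packets is the last packet of its AU and correctly carries the marker. Nothing is broken here.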

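    The timestamps are non-monotonic because you have enabled B-frames (-bf 2). The RTP timestamp carries each frame's presentation time (PTS), but packets are sent in decoding order. A B-frame is displayed before the later frame it references, yet it can only be encoded and sent after that frame, so the later timestamp appears on the wire first. Set -bf 0 to disable B-frames and get monotonically increasing timestamps.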
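To illustrate the reordering, here is a minimal sketch. It assumes a fixed pattern in which every anchor (I/P) frame is followed in the bitstream by the B-frames that precede it in display order. Real x264 output can differ (adaptive B-frame placement, B-pyramid), and the function names are hypothetical. The first timestamp is taken from the capture in the question.

```python
TICKS_PER_FRAME = 3600      # 90 kHz clock / 25 fps
FIRST_TS = 3153114809       # timestamp of the first frame in the capture


def decode_order(num_frames, bframes):
    """Display indices of frames, listed in the order they are emitted."""
    order = [0]             # the first frame has no B-frames before it
    next_display = 1
    while next_display < num_frames:
        # The next anchor frame sits `bframes` positions ahead in display order.
        anchor = min(next_display + bframes, num_frames - 1)
        order.append(anchor)                        # anchor is sent first
        order.extend(range(next_display, anchor))   # then the B-frames before it
        next_display = anchor + 1
    return order


def rtp_timestamps(num_frames, bframes):
    """RTP timestamps in transmission order under the assumptions above."""
    return [FIRST_TS + i * TICKS_PER_FRAME for i in decode_order(num_frames, bframes)]


print(decode_order(7, 2))    # with -bf 2
print(rtp_timestamps(7, 2))
print(decode_order(7, 0))    # with -bf 0
```

With two B-frames, the sketch yields the display indices 0, 3, 1, 2, 6, 4, 5, and the resulting timestamps match the first seven distinct values in the capture. With zero B-frames, the order is strictly increasing.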