Assume an H.264-encoded video stream is stored in an mp4 container. What is a straightforward way of detecting frame types and associating them with parts of the stored data?
I can extract the frame types using the command below. I would like to associate them with specific data segments (e.g. byte X to byte Y) so that I can apply different amounts of noise to I-, P-, and B-frames. At the moment, I'm using a simple Python script to flip random bits in the stored data at a fixed error rate, regardless of the frame type.
ffprobe -show_frames <filename>.mp4
As I later discovered, one way of identifying the bytes of each I-frame is to use ffprobe and its pkt_pos and pkt_size fields. The first gives the file offset of the frame's first byte, while their sum minus one gives the offset of the frame's last byte. In Python, a frame's bytes can then be extracted using
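with open(f'{name_root}.{name_extension}', 'rb') as f:
    data = bytearray(f.read())  # mutable sequence of byte values (0..255)
# ffprobe prints pkt_pos and pkt_size as text, so convert them to int first
pkt_pos, pkt_size = int(pkt_pos), int(pkt_size)
frame = data[pkt_pos:pkt_pos + pkt_size]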
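Since the aim is to add noise to selected frames only, the same byte range can be corrupted in place rather than extracted. The following is a minimal sketch, assuming the file contents are held in a mutable bytearray; flip_bits and its parameters are hypothetical names chosen for illustration, not part of any library.

```python
import random


def flip_bits(data, start, size, error_rate, rng=None):
    """Flip each bit of data[start:start + size] independently with
    probability error_rate. Modifies data in place and returns the
    number of bits flipped. `data` is assumed to be a bytearray."""
    rng = rng or random.Random()
    flipped = 0
    for i in range(start, start + size):
        mask = 0
        for bit in range(8):
            if rng.random() < error_rate:
                mask |= 1 << bit
                flipped += 1
        data[i] ^= mask  # XOR toggles exactly the selected bits
    return flipped
```

Calling flip_bits(data, pkt_pos, pkt_size, rate) once per frame, with a rate chosen according to the frame's type, leaves every byte outside the frame untouched, so the modified bytearray can be written back to a new file afterwards.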
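Dealing with multiple frames makes matters more complicated. The command below displays the positions of all I-frames. Note that the -B 18 and -A 11 context sizes assume a particular number of fields before and after pict_type in each frame block, so they may need adjusting if your ffprobe output is laid out differently.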
ffprobe -show_frames <filename>.mp4 | grep "=I" -B 18 -A 11 | grep "pkt_pos"
I decided to copy the output to a CSV file, which I then open in Python.
Note that pkt_pos in the above can be substituted with pkt_size or any other parameter.
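As a possible alternative to grepping with fixed context sizes and copying the result by hand, ffprobe can be asked for only the relevant fields in JSON form, which Python can parse directly. The sketch below assumes ffprobe is on the PATH and that the first video stream (v:0) is the one of interest; probe_frames and ranges_by_type are hypothetical helper names.

```python
import json
import subprocess
from collections import defaultdict


def probe_frames(path):
    """Run ffprobe on `path` and return its JSON frame report as text."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "frame=pkt_pos,pkt_size,pict_type",
        "-of", "json",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def ranges_by_type(ffprobe_json):
    """Map each picture type (e.g. 'I', 'P', 'B') to a list of
    (first_byte, last_byte) file offsets, both inclusive."""
    ranges = defaultdict(list)
    for frame in json.loads(ffprobe_json).get("frames", []):
        # Skip entries that lack position or size information.
        if "pkt_pos" not in frame or "pkt_size" not in frame:
            continue
        pos = int(frame["pkt_pos"])    # ffprobe emits these as strings
        size = int(frame["pkt_size"])
        ranges[frame.get("pict_type", "?")].append((pos, pos + size - 1))
    return dict(ranges)
```

With these, ranges_by_type(probe_frames('video.mp4')) would return one list of byte ranges per frame type, where 'video.mp4' stands in for the actual file name.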