python video ffmpeg video-processing pyav

Creating video from images using PyAV

I am trying to write a function that creates a new MP4 video from a set of frames taken from another video. The frames will be given in PIL.Image format and are often cropped to include only part of the input video, but all images will have the same dimensions.

What I have tried:

def modify_image(img):
    return img

test_input = av.open('input_vid.mp4')
test_output = av.open('output_vid.mp4', 'w')

in_stream = test_input.streams.video[0]
out_stream = test_output.add_stream(template=in_stream)

for frame in test_input.decode(in_stream):
    img_frame = frame.to_image()

    # Some possible modifications to img_frame...
    img_frame = modify_image(img_frame)

    out_frame = av.VideoFrame.from_image(img_frame)
    out_packet = out_stream.encode(out_frame)
    print(out_packet)

     
test_input.close()
test_output.close()

And the error that I got:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[23], line 11
      8     img_frame = frame.to_image()
     10     out_frame = av.VideoFrame.from_image(img_frame)
---> 11     out_packet = out_stream.encode(out_frame)
     12     print(out_packet)
     15 test_input.close()

File av\stream.pyx:153, in av.stream.Stream.encode()

File av\codec\context.pyx:490, in av.codec.context.CodecContext.encode()

File av\frame.pyx:52, in av.frame.Frame._rebase_time()

ValueError: Cannot rebase to zero time.

I followed the answer given in How to create a video out of frames without saving it to disk using python?, and ran into the same issue.

Comparing the original VideoFrame with the VideoFrame created from the image, I found that the pts value of the new frame is None instead of an integer. Overwriting the pts value of the new frame with the original value still causes the same error, and overwriting the dts value of the new frame gives the following error:

AttributeError: attribute 'dts' of 'av.frame.Frame' objects is not writable

Is there a way to modify the dts value, or possibly another method to create a video from a set of PIL.Image objects?

Solution

Using add_stream(template=in_stream) is only documented in the Remuxing example.
It's probably possible to use template=in_stream when re-encoding, but then we have to set the time base ourselves, and set the PTS timestamp of each encoded packet.
I found a discussion here (I didn't try it).
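
To make the time base and PTS relationship concrete, here is a minimal sketch of the arithmetic in plain Python. The helper name frame_pts is hypothetical and a constant frame rate is assumed; it only illustrates how a PTS value follows from the frame index, and is not a tested fix for the template=in_stream case.

```python
from fractions import Fraction


def frame_pts(index, fps, time_base):
    """Return the integer PTS of frame `index` at a constant frame rate.

    PTS counts ticks of `time_base` (seconds per tick), so the frame's
    presentation time in seconds is divided by the time base.
    """
    seconds = Fraction(index) / Fraction(fps)
    return round(seconds / time_base)


# With time_base = 1/fps, the PTS is simply the frame index.
print(frame_pts(3, 25, Fraction(1, 25)))
# With a finer time base, each frame spans several ticks.
print(frame_pts(1, 25, Fraction(1, 12800)))
```

Choosing a time base of 1/fps keeps the PTS values equal to the frame counter, which is the simplest option when all frames are evenly spaced.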

Instead of using template=in_stream, we may stick to the code sample from my other answer, and copy a few parameters from the input stream to the output stream.

Example:

in_stream = test_input.streams.video[0]
codec_name = in_stream.codec_context.name  # Get the codec name from the input video stream.
fps = in_stream.codec_context.rate  # Get the framerate from the input video stream.
out_stream = test_output.add_stream(codec_name, str(fps))
out_stream.width = in_stream.codec_context.width  # Set frame width to be the same as the width of the input stream
out_stream.height = in_stream.codec_context.height  # Set frame height to be the same as the height of the input stream
out_stream.pix_fmt = in_stream.codec_context.pix_fmt  # Copy pixel format from input stream to output stream
#out_stream.options = {'crf': '17'}  # Select low crf for high quality (the price is larger file size).

We also have to "mux" each encoded packet into the output file:

test_output.mux(out_packet)

At the end, we have to flush the encoder before closing the file:

out_packet = out_stream.encode(None)
test_output.mux(out_packet)

Code sample:

import av

# Build input_vid.mp4 using FFmpeg CLI (for testing):
# ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1:duration=100 -vcodec libx264 -crf 10 -pix_fmt yuv444p input_vid.mp4

test_input = av.open('input_vid.mp4')
test_output = av.open('output_vid.mp4', 'w')

in_stream = test_input.streams.video[0]
#out_stream = test_output.add_stream(template=in_stream)  # Using template=in_stream is not working (probably meant to be used for re-muxing and not for re-encoding).

codec_name = in_stream.codec_context.name  # Get the codec name from the input video stream.
fps = in_stream.codec_context.rate  # Get the framerate from the input video stream.
out_stream = test_output.add_stream(codec_name, str(fps))
out_stream.width = in_stream.codec_context.width  # Set frame width to be the same as the width of the input stream
out_stream.height = in_stream.codec_context.height  # Set frame height to be the same as the height of the input stream
out_stream.pix_fmt = in_stream.codec_context.pix_fmt  # Copy pixel format from input stream to output stream
#out_stream.options = {'crf': '17'}  # Select low crf for high quality (the price is larger file size).

for frame in test_input.decode(in_stream):
    img_frame = frame.to_image()
    out_frame = av.VideoFrame.from_image(img_frame)  # Note: to_image and from_image are not required in this specific example.
    out_packet = out_stream.encode(out_frame)  # Encode video frame
    test_output.mux(out_packet)  # "Mux" the encoded frame (add the encoded frame to MP4 file).
    print(out_packet)

# Flush the encoder
out_packet = out_stream.encode(None)
test_output.mux(out_packet)

test_input.close()
test_output.close()
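
Since the question mentions cropped frames, the output size may differ from the input stream's size. The following is a minimal sketch, not a tested drop-in, that takes the size from the images themselves. The function names write_video and even_size, the default codec name "h264" and the yuv420p pixel format are assumptions for illustration. Both dimensions are trimmed to even numbers because libx264 rejects odd sizes with yuv420p.

```python
from fractions import Fraction


def even_size(width, height):
    """Round both dimensions down to the nearest even number."""
    return width - width % 2, height - height % 2


def write_video(images, path, fps=25, codec_name="h264"):
    """Encode a sequence of same-sized PIL.Image objects into a video file."""
    import av  # Imported here so even_size() can be used without PyAV installed.

    images = list(images)
    if not images:
        raise ValueError("at least one image is required")

    # Take the output size from the (possibly cropped) images, not the input video.
    width, height = even_size(*images[0].size)

    with av.open(path, "w") as container:
        stream = container.add_stream(codec_name, rate=Fraction(fps))
        stream.width = width
        stream.height = height
        stream.pix_fmt = "yuv420p"  # Assumed; widely supported by players.

        for img in images:
            # Trim a possible odd row/column so the frame matches the stream size.
            frame = av.VideoFrame.from_image(img.crop((0, 0, width, height)))
            for packet in stream.encode(frame):
                container.mux(packet)

        # Flush the encoder so the last buffered frames are written.
        for packet in stream.encode(None):
            container.mux(packet)
```

With Pillow and PyAV installed, a call such as write_video(frames, 'output_vid.mp4', fps=25), where frames is a list of PIL.Image objects, should write the file without needing an input stream as a template.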