
Reading a video directly into Numpy with PyAV (no iterations)


Is it possible to read a video directly into a 3D NumPy array with PyAV? Currently, I am looping through each frame:

import av
import numpy as np

container = av.open('myvideo.avi')
for i, frame in enumerate(container.decode(video=0)):
    gray = frame.to_ndarray(format='gray')  # 2D array of shape (height, width)
    if i == 0:
        V = gray
    else:
        V = np.dstack((V, gray))  # Copies all of V on every iteration

The first frame defines a 2D NumPy array (i=0); each subsequent frame (i>0) is stacked onto it along the third axis using np.dstack. Ideally, I would like to read the entire video into a 3D NumPy array of grayscale frames, all at once.


Solution

  • I couldn't find a solution using PyAV, so this answer uses ffmpeg-python instead.

    ffmpeg-python is a Pythonic wrapper around the FFmpeg command-line tool, whereas PyAV binds directly to the FFmpeg libraries.

    The code reads the entire video into a 3D NumPy array of grayscale frames, all at once.

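    The solution performs the following steps:

      • Build a short synthetic test video with FFmpeg's testsrc source (for testing only).
      • Use FFprobe to get the resolution of the video frames.
      • Run FFmpeg with the output piped to stdout as raw 8-bit grayscale video, and capture all the bytes at once.
      • Compute the number of frames from the byte count, and reshape the buffer to (n_frames, height, width).
      • Display the first frame (for testing).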

    Here is the code (please read the comments):

    import ffmpeg
    import numpy as np
    from PIL import Image
    
    in_filename = 'in.avi'
    
    # Build synthetic video, for testing (begin):
    # ffmpeg -y -r 10 -f lavfi -i testsrc=size=160x120:rate=1 -c:v libx264 -t 5 in.avi
    width, height = 160, 120
    
    (
        ffmpeg
        .input('testsrc=size={}x{}:rate=1'.format(width, height), r=10, f='lavfi')
        .output(in_filename, vcodec='libx264', t=5)
        .overwrite_output()
        .run()
    )
    # Build synthetic video, for testing (end)
    
         
    # Use FFprobe for getting the resolution of the video frames
    p = ffmpeg.probe(in_filename, select_streams='v')
    width = p['streams'][0]['width']
    height = p['streams'][0]['height']
    
    # https://github.com/kkroening/ffmpeg-python/blob/master/examples/README.md
    # Stream the entire video as one large array of bytes
    in_bytes, _ = (
        ffmpeg
        .input(in_filename)
        .video # Video only (no audio).
        .output('pipe:', format='rawvideo', pix_fmt='gray')  # Set the output format to raw video in 8 bit grayscale
        .run(capture_stdout=True)
    )
    
    n_frames = len(in_bytes) // (height*width)  # Compute the number of frames.
    frames = np.frombuffer(in_bytes, np.uint8).reshape(n_frames, height, width) # Reshape buffer to array of n_frames frames (shape of each frame is (height, width)).
    
    im = Image.fromarray(frames[0, :, :])  # Convert first frame to image object
    im.show()  # Display the image
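
The reshape step can be tried without FFmpeg. Below is a minimal sketch that replaces the captured stdout with a hypothetical byte string (3 tiny frames), adds a check that the byte count is a whole number of frames, and shows how to get the (height, width, n_frames) layout that np.dstack produces in the question. The sizes and the name frames_hwn are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the bytes captured from FFmpeg's stdout:
# 3 grayscale frames of 4x6 pixels, one byte per pixel.
n_frames, height, width = 3, 4, 6
in_bytes = bytes(range(n_frames * height * width))

frame_size = height * width
# A partial trailing frame would make the reshape fail, so check first.
if len(in_bytes) % frame_size != 0:
    raise ValueError("byte count is not a whole number of frames")

# np.frombuffer wraps the bytes without copying; the result is read-only.
frames = np.frombuffer(in_bytes, np.uint8).reshape(-1, height, width)

# Layout used in the question (np.dstack): (height, width, n_frames).
# np.moveaxis returns a view, so no pixel data is copied here.
frames_hwn = np.moveaxis(frames, 0, -1)

print(frames.shape, frames_hwn.shape)
```

Because the array wraps an immutable bytes object, call frames.copy() first if the pixels need to be modified in place.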
    



    Update:

    Using PyAV:

    When using PyAV, we have to decode the video frame by frame.

    The main advantage of using PyAV over ffmpeg-python is that it works without the FFmpeg CLI being present (without ffmpeg.exe in Windows).

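    For reading all video frames into one NumPy array, we may use the following stages:

      • Create an empty list for storing the frames.
      • Decode the video frame by frame, convert each frame to a NumPy array, and append it to the list.
      • After the loop, convert the list to a single NumPy array.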


    Code sample (uses OpenCV for showing the frames for testing):

    import av
    import numpy as np
    import cv2
    
    # Build input file using FFmpeg CLI (for testing):
    # ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1:duration=10 -vcodec libx264 -pix_fmt yuv420p myvideo.avi
    
    container = av.open('myvideo.avi')
    
    frames = []  # List of frames - store video frames after converting to NumPy array.
    
    for frame in container.decode(video=0):
        # Decode video frame, and convert to NumPy array in BGR pixel format (use BGR because it is used by OpenCV).
        frame = frame.to_ndarray(format='bgr24')  # For Grayscale video, use: frame = frame.to_ndarray(format='gray')
        frames.append(frame)  # Append the frame to the list of frames.
    
    # Convert the list to NumPy array.
    # Shape of each frame is (height, width, 3) [for Grayscale the shape is (height, width)]
    # the shape of frames is (n_frames, height, width, 3)  [for Grayscale the shape is (n_frames, height, width)]
    frames = np.array(frames)
    
    # Show the frames for testing:
    for i in range(len(frames)):
        cv2.imshow('frame', frames[i])
        cv2.waitKey(1000)
    
    cv2.destroyAllWindows()
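
If the intermediate list is a concern, a possible variation is to preallocate the output array and fill it frame by frame. The sketch below is only an illustration: stack_frames is a hypothetical helper (not part of PyAV), it takes any iterable of equally shaped NumPy arrays, and it treats the frame count as a hint, since the count reported by a container (for example stream.frames in PyAV) may be 0 or inaccurate for some files.

```python
import numpy as np

def stack_frames(frame_iter, n_frames=0):
    """Collect equally shaped frames into one array of shape (n, ...).

    n_frames is only a hint; 0 (or less) means the count is unknown.
    """
    if n_frames <= 0:
        # Count unknown: gather in a list and copy once at the end.
        return np.stack(list(frame_iter))

    out = None
    count = 0
    for frame in frame_iter:
        if out is None:
            # Allocate once, using the first frame's shape and dtype.
            out = np.empty((n_frames,) + frame.shape, dtype=frame.dtype)
        elif count == len(out):
            # Hint was too low: double the buffer (costs one extra copy).
            out = np.concatenate([out, np.empty_like(out)])
        out[count] = frame
        count += 1

    if out is None:
        raise ValueError("no frames decoded")
    return out[:count]  # Trim if the hint was too high.

# Possible use with PyAV (assumes myvideo.avi exists; not run here):
# with av.open('myvideo.avi') as container:
#     stream = container.streams.video[0]
#     gray = (f.to_ndarray(format='gray') for f in container.decode(stream))
#     V = stack_frames(gray, stream.frames)  # shape (n_frames, height, width)

# Small self-check with synthetic 2x3 "frames".
demo = stack_frames((np.full((2, 3), i, np.uint8) for i in range(5)), 5)
print(demo.shape)
```

Either way the video is still decoded frame by frame; the only difference is where the decoded pixels are stored.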