python python-requests video-streaming file-descriptor ffprobe

Python Requests with stream=True does not work as expected when piped to ffprobe: trying to get the duration of a remote video


I am trying to get the duration of a remote video file (say, an mp4).

I already know how to get the duration of a local video file:

import xml.etree.ElementTree as eltt, subprocess as spr

def size_from_fn(file_name):
    size = eltt.fromstring(
        spr.run(["ffprobe",
            "-i", file_name,
            "-show_format", "-output_format", "xml"
            ], stdout = spr.PIPE, stderr = spr.STDOUT).stdout.decode()
        ).find("format").get("duration")
    return size

def size_from_fd(file_descriptor):
    size = eltt.fromstring(
        spr.run(["ffprobe",
            "-i", "pipe:0",
            "-show_format", "-output_format", "xml"
            ], stdin = file_descriptor, stdout = spr.PIPE, stderr = spr.STDOUT).stdout.decode()
        ).find("format").get("duration")
    return size

def size_from_data(data):
    size = eltt.fromstring(
        spr.run(["ffprobe",
            "-i", "pipe:0",
            "-show_format", "-output_format", "xml"
            ], input = data, stdout = spr.PIPE, stderr = spr.STDOUT).stdout.decode()
        ).find("format").get("duration")
    return size

All three work perfectly.

I also know how to get an HTTP response as a file-like object:

import requests as rq

def url_to_fd(url):
    req = rq.get(url, stream = True)
    return req.raw

This also works.

However, combining the two fails with this message from ffprobe: Invalid data found when processing input

I have no idea why. The only difference I know of is that the file object returned from the URL is not seekable (it can only be read forward). But if I disable seeking on a normal file object:

with open("test.mp4", "rb") as f:
    f.seek = None
    size_from_fd(f)

this works, which seems to show that ffprobe doesn't do any seeking.

Reading the whole response first also works, so I don't know what is going on:

def get_duration(url):
    complete_data = url_to_fd(url).read()
    return size_from_data(complete_data)

My problem is that video files may be arbitrarily large, so I can't afford to download the whole video.

Test video URL


Solution

  • As @tepalia already mentioned, you can pass the URL directly to ffprobe.

    I'll only add that you can also pass other parameters to get the duration directly:

    ffprobe https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4  \
    -v quiet \
    -show_entries format=duration \
    -output_format default=noprint_wrappers=1:nokey=1
    

    Result:

    596.474195
    

    -v quiet - suppresses the information about the program, libraries, etc.

    -show_entries format=duration - shows only the duration, as [FORMAT]duration=596.474195[/FORMAT]

    -output_format default=noprint_wrappers=1:nokey=1 - removes the [FORMAT][/FORMAT] wrapper and the duration= key


    If you need other values at the same time (e.g. size), you can use format=duration,size and each value will be on a separate line:

    596.474195
    158008374
    

    There is also -hide_banner, which is similar to -v quiet but hides only the startup banner.
    (-v is short for -loglevel and also accepts other values, e.g. -v debug)


    You can find more information on the sister sites Super User and Video Production.

    For example: ffmpeg - How to get video duration in seconds? - Super User


    There is also the ffprobe Documentation.
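
To call this from Python instead of the shell, a minimal sketch could look like the one below. It assumes ffprobe is on PATH. The helper names (build_ffprobe_cmd, parse_entries, probe_url) and the timeout are made up for illustration, and -of is used as the short form of -output_format. Unlike the command above, it keeps the keys (no nokey=1), so values are matched by name rather than by line order. A likely reason the req.raw attempt in the question fails is that subprocess hands ffprobe only the object's underlying file descriptor (the socket), not the body as requests would decode it. Passing the URL avoids that, because ffprobe does its own HTTP reads.

```python
import subprocess


def build_ffprobe_cmd(url, entries=("duration",)):
    """Build an ffprobe command that prints the requested format entries as key=value lines."""
    return [
        "ffprobe",
        "-v", "quiet",
        "-show_entries", "format=" + ",".join(entries),
        # Keep the keys (no nokey=1) so values can be looked up by name.
        "-of", "default=noprint_wrappers=1",
        url,
    ]


def parse_entries(output):
    """Turn 'key=value' lines into a dict of strings, skipping lines without '='."""
    result = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            result[key.strip()] = value.strip()
    return result


def probe_url(url, entries=("duration",), timeout=60):
    """Run ffprobe against a URL and return the requested entries as a dict.

    Assumes ffprobe is installed and on PATH; raises CalledProcessError
    if ffprobe exits with a non-zero status.
    """
    completed = subprocess.run(
        build_ffprobe_cmd(url, entries),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=True,
        timeout=timeout,
    )
    return parse_entries(completed.stdout.decode())
```

Usage would be something like float(probe_url(url)["duration"]), or probe_url(url, ("duration", "size")) to get both values in one call. The values come back as strings, so convert them as needed.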