python ffmpeg subprocess

Splitting video into shots with ffmpeg from within Python


After some searching, the following command works to split my input video into shots, and save the first frame from each shot.

ffmpeg -i input_vid.mp4 -filter:v "select='gt(scene,0.1)',showinfo" -vsync 0 frames/%05d.jpg

When run from the terminal, it prints one line per extracted frame to what I think is stderr, like so:

[Parsed_showinfo_1 @ 0x7f0198003540] n: 884 pts: 132552 pts_time:2209.2  duration: 2 duration_time:0.0333333 fmt:yuv420p cl:left sar:0/1 s:480x360 i:P iskey:0 type:P checksum:F2D8FCAA plane_checksum:[2D2F32C1 6E0164EC 832064FD] mean:[16 128 128] stdev:[0.0 0.0 0.0]
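
Once such lines are available as text, the timestamp can be pulled out with a regular expression. The following is only a minimal sketch: `PTS_TIME_RE` and `extract_pts_times` are names made up for illustration, and it assumes the showinfo line format stays as in the example above (it may differ between ffmpeg versions).

```python
import re

# Matches the "pts_time:<number>" field of a showinfo log line.
PTS_TIME_RE = re.compile(r"pts_time:\s*([0-9]+(?:\.[0-9]+)?)")


def extract_pts_times(log_text):
    """Return the pts_time values (in seconds) found in showinfo lines."""
    times = []
    for line in log_text.splitlines():
        if "Parsed_showinfo" not in line:
            continue  # skip banner, stream info and progress lines
        match = PTS_TIME_RE.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


# Shortened copy of the example line from the terminal output.
sample = (
    "[Parsed_showinfo_1 @ 0x7f0198003540] n: 884 pts: 132552 "
    "pts_time:2209.2  duration: 2 duration_time:0.0333333 fmt:yuv420p"
)
print(extract_pts_times(sample))  # prints [2209.2]
```

The function takes the whole captured log as one string, so it could be fed something like `x.stderr.decode()` from a `subprocess.run` call.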

I want to run this from Python and capture the output, so that I can get the timestamps of the extracted frames (e.g. pts_time:2209.2 in the example above). But when I try it with subprocess.run, I get the following error, and no files are written to frames/.

>> import subprocess
>> import imageio_ffmpeg
>> FFMPEG_PATH = imageio_ffmpeg.get_ffmpeg_exe()
>> x = subprocess.run([FFMPEG_PATH, "-i", "input_vid.mp4", '-filter:v ' "select=" "'gt(scene,0.1)'" ",showinfo", "-vsync", "0", r"frames/%05d.jpg"], capture_output=True)
>> print(x.stderr.decode())
  ffmpeg version 4.2.2-static https://johnvansickle.com/ffmpeg/  Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input_vid.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2013-06-21T09:09:58.000000Z
  Duration: 00:37:10.88, start: 0.000000, bitrate: 394 kb/s
    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 480x360, 296 kb/s, 30 fps, 30 tbr, 60 tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
    Metadata:
      creation_time   : 2013-06-21T09:10:07.000000Z
      handler_name    : IsoMedia File Produced by Google, 5-11-2011
[NULL @ 0x58e6e80] Unable to find a suitable output format for '0'
0: Invalid argument

If I remove the filter argument, it does extract frames, but far too many: every frame of the video is written, so I need to be able to specify a threshold such as gt(scene,0.1). This run also doesn't capture any output that tells me the timestamps:

>> x = subprocess.run([FFMPEG_PATH, "-i", "input_vid.mp4", "-vsync", "0", "frames2/%05d.jpg"], capture_output=True)
>> print(x.stderr.decode())
ffmpeg version 4.2.2-static https://johnvansickle.com/ffmpeg/  Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input_vid.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2013-06-21T09:09:58.000000Z
  Duration: 00:37:10.88, start: 0.000000, bitrate: 394 kb/s
    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 480x360, 296 kb/s, 30 fps, 30 tbr, 60 tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
    Metadata:
      creation_time   : 2013-06-21T09:10:07.000000Z
      handler_name    : IsoMedia File Produced by Google, 5-11-2011
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> mjpeg (native))
Press [q] to stop, [?] for help
[swscaler @ 0x6af2200] deprecated pixel format used, make sure you did set range correctly
Output #0, image2, to 'frames2/%05d.jpg':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    encoder         : Lavf58.29.100
    Stream #0:0(und): Video: mjpeg, yuvj420p(pc), 480x360, q=2-31, 200 kb/s, 30 fps, 30 tbn, 30 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      encoder         : Lavc58.54.100 mjpeg
    Side data:
      cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: -1
frame=66922 fps=1571 q=24.8 Lsize=N/A time=00:37:10.73 bitrate=N/A speed=52.4x    
video:310586kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown

>> print(x.stdout.decode())

It seems like there are two problems: something to do with the filter argument, and capturing the output that contains the timestamps. I've tried various ways of writing the nested string quotes in the filter argument: triple quotes, escaping, changing the order of double and single quotes, and concatenating separate strings as shown above.


Solution

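  • You didn't split the filter arguments correctly. Python concatenates adjacent string literals, so

    ..., '-filter:v ' "select=" "'gt(scene,0.1)'" ",showinfo", ...

    is a single list element, "-filter:v select='gt(scene,0.1)',showinfo", and ffmpeg never receives -filter:v as an option followed by its value. The option name and the filter expression must be two separate list elements:

    "-filter:v", "select='gt(scene,0.1)',showinfo", ...

    No shell is involved when subprocess.run is given a list, so the filter string needs no quoting or escaping beyond what the terminal command had inside its double quotes.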
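
To make the difference concrete, here is a minimal sketch of the corrected call. The helper names `build_scene_command` and `run_scene_detection` are made up for illustration, and the sketch assumes that the `frames/` directory already exists and that the showinfo lines go to stderr, as the question suspects.

```python
import subprocess


def build_scene_command(ffmpeg_path, input_path, output_pattern, threshold=0.1):
    """Return the argument list for the scene-detection command.

    Each list element reaches ffmpeg as exactly one argument, so the
    option name and the filter expression are separate elements.
    """
    filter_expr = f"select='gt(scene,{threshold})',showinfo"
    return [
        ffmpeg_path,
        "-i", input_path,
        "-filter:v", filter_expr,
        "-vsync", "0",
        output_pattern,
    ]


# Adjacent string literals are concatenated by Python into ONE string,
# which is what happened in the original list.
broken = ['-filter:v ' "select=" "'gt(scene,0.1)'" ",showinfo"]
assert broken == ["-filter:v select='gt(scene,0.1)',showinfo"]

cmd = build_scene_command("ffmpeg", "input_vid.mp4", "frames/%05d.jpg")


def run_scene_detection(command):
    """Run ffmpeg and return its stderr as text (needs ffmpeg and the input file)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stderr
```

In the question's setup, the first argument would be `FFMPEG_PATH` rather than the plain string `"ffmpeg"`. The text returned by `run_scene_detection(cmd)` should then contain the showinfo lines with the `pts_time` values.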