ffmpeg, video-processing, ip-camera

Hikvision NVR video conversion ffmpeg


I have a Hikvision NVR that stores security camera footage, which I need to display on a website. I know that Hikvision uses a proprietary H.264 codec, which makes the footage impossible to play coherently in popular video players such as VLC unless you install that codec on every machine that plays it.

My plan was to transcode the video with ffmpeg to regular H.264 video and AAC audio, but the resulting file has the same issues as the original: no audio during playback and badly disrupted video. So the question is: does ffmpeg support transcoding from Hikvision's video/audio codecs? Or should I try converting to different web-capable codecs with ffmpeg? My ffmpeg command looks like this:

ffmpeg -i C:\1.mp4  -c:v libx264 -preset fast -crf 30 -b:v 200k -c:a aac -strict experimental -movflags faststart -threads 0 C:\2.mp4

EDIT: Interestingly, ffplay.exe opens and plays the original video files with no problem whatsoever, even on a computer where the Hikvision codecs are not installed, so I figured conversion should be possible as well.

Mediainfo output of the video file in question:

General
CompleteName                     : C:\DownLoad\1.mp4
Format                           : MPEG-PS
FileSize/String                  : 8.60 MiB
Duration/String                  : 2 h 7 min
OverallBitRate/String            : 9 395 b/s
FileExtension_Invalid            : mpeg mpg m2p vob pss evo

Video
ID/String                        : 224 (0xE0)
Format                           : AVC
Format/Info                      : Advanced Video Codec
Format_Profile                   : Baseline@L4
Format_Settings                  : 1 Ref Frames
Format_Settings_CABAC/String     : No
Format_Settings_RefFrames/String : 1 frame
Format_Settings_GOP              : M=1, N=30
Duration/String                  : 2 min 0 s
Width/String                     : 1 920 pixels
Height/String                    : 1 080 pixels
DisplayAspectRatio/String        : 16:9
FrameRate_Mode/String            : Variable
ColorSpace                       : YUV
ChromaSubsampling/String         : 4:2:0
BitDepth/String                  : 8 bits
ScanType/String                  : Progressive

Audio
ID/String                        : 192 (0xC0)
Format                           : MPEG Audio
Duration/String                  : 2 h 7 min
Compression_Mode/String          : Lossy
Video_Delay/String               : -33 min 40 s

Output of ffmpeg:

C:\ffmpeg\bin>ffmpeg -i C:\DownLoad\1.mp4  -c:v libx264 -preset fast -crf 30 -b:v 75k -c:a aac -strict experimental -movflags faststart -threads 0 C:\DownLoad\2.mp4
ffmpeg version N-86537-gae6f6d4 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 7.1.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-zlib
  libavutil      55. 66.100 / 55. 66.100
  libavcodec     57. 99.100 / 57. 99.100
  libavformat    57. 73.100 / 57. 73.100
  libavdevice    57.  7.100 / 57.  7.100
  libavfilter     6. 94.100 /  6. 94.100
  libswscale      4.  7.101 /  4.  7.101
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100
Input #0, mpeg, from 'C:\DownLoad\1.mp4':
  Duration: 02:07:57.93, start: 789.820800, bitrate: 9 kb/s
    Stream #0:0[0x1e0]: Video: h264 (Baseline), yuv420p(progressive), 1920x1080, 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x1c0]: Audio: pcm_mulaw, 8000 Hz, mono, s16, 64 kb/s
File 'C:\DownLoad\2.mp4' already exists. Overwrite ? [y/N] y
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (pcm_mulaw (native) -> aac (native))
Press [q] to stop, [?] for help
[aac @ 0000000002cd0280] Too many bits 8832.000000 > 6144 per frame requested, clamping to max
[libx264 @ 0000000002514c80] using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX XOP FMA4
[libx264 @ 0000000002514c80] profile High, level 4.0
[libx264 @ 0000000002514c80] 264 - core 150 r2833 df79067 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=2 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=6 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=30 rc=crf mbtree=1 crf=30.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'C:\DownLoad\2.mp4':
  Metadata:
    encoder         : Lavf57.73.100
    Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 1920x1080, q=-1--1, 75 kb/s, 25 fps, 12800 tbn, 25 tbc
    Metadata:
      encoder         : Lavc57.99.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/75000 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) ([64][0][0][0] / 0x0040), 8000 Hz, mono, fltp, 48 kb/s
    Metadata:
      encoder         : Lavc57.99.100 aac
[mp4 @ 00000000010e9e00] Starting second pass: moving the moov atom to the beginning of the file speed= 116x
frame= 3269 fps= 66 q=-1.0 Lsize=   11086kB time=01:34:24.38 bitrate=  16.0kbits/s dup=269 drop=0 speed= 115x
video:10429kB audio:592kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.594114%
[libx264 @ 0000000002514c80] frame I:14    Avg QP:21.86  size: 59795
[libx264 @ 0000000002514c80] frame P:833   Avg QP:24.81  size:  8993
[libx264 @ 0000000002514c80] frame B:2422  Avg QP:28.70  size:   970
[libx264 @ 0000000002514c80] consecutive B-frames:  1.0%  0.2%  1.4% 97.4%
[libx264 @ 0000000002514c80] mb I  I16..4: 18.9% 66.3% 14.8%
[libx264 @ 0000000002514c80] mb P  I16..4:  4.0%  7.7%  0.4%  P16..4: 16.2%  2.0%  0.6%  0.0%  0.0%    skip:69.1%
[libx264 @ 0000000002514c80] mb B  I16..4:  0.6%  0.2%  0.0%  B16..8:  5.5%  0.1%  0.0%  direct: 0.7%  skip:92.9%  L0:44.0% L1:55.0% BI: 1.0%
[libx264 @ 0000000002514c80] 8x8 transform intra:59.0% inter:83.3%
[libx264 @ 0000000002514c80] coded y,uvDC,uvAC intra: 25.3% 36.1% 7.7% inter: 1.0% 2.3% 0.1%
[libx264 @ 0000000002514c80] i16 v,h,dc,p: 23% 24% 43% 10%
[libx264 @ 0000000002514c80] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 37% 26% 23%  2%  2%  3%  2%  3%  3%
[libx264 @ 0000000002514c80] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 43% 23% 12%  4%  4%  5%  4%  4%  2%
[libx264 @ 0000000002514c80] i8c dc,h,v,p: 81%  7%  9%  3%
[libx264 @ 0000000002514c80] Weighted P-Frames: Y:1.0% UV:0.0%
[libx264 @ 0000000002514c80] ref P L0: 73.6% 26.4%
[libx264 @ 0000000002514c80] ref B L0: 80.9% 19.1%
[libx264 @ 0000000002514c80] ref B L1: 90.0% 10.0%
[libx264 @ 0000000002514c80] kb/s:653.30
[aac @ 0000000002cd0280] Qavg: 64512.656

C:\ffmpeg\bin>

Download link to sample:

https://www.dropbox.com/s/9ccptsuiqk2ntsv/1.zip?dl=0

This sample is exactly 2 minutes long, but VLC will tell you otherwise.


Solution

  • I was able to produce a normalized video file by doing the following:

    1. Extracting the audio stream from my MPEG-PS video file with ffmpeg, encoding it with -acodec aac
    2. Removing the audio stream from the original MPEG-PS video file with ffmpeg and -c:v copy, using the -t option to specify the actual duration of the video
    3. Merging the two files together

    The result is a file that plays in any video player I tried. Tested in VLC and MPC-HC.
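The three steps above can be sketched as ffmpeg argument lists. This is only an illustration, not the exact commands I ran: the function name `remux_commands` and the intermediate file names (`audio.m4a`, `video.mp4`, `merged.mp4`) are made up, and it assumes you already know the real duration of the clip in seconds.

```python
def remux_commands(src, duration_s,
                   audio_out="audio.m4a",    # hypothetical file name
                   video_out="video.mp4",    # hypothetical file name
                   merged="merged.mp4"):     # hypothetical file name
    """Build ffmpeg argument lists for the extract / strip / merge steps."""
    # Step 1: drop the video (-vn) and encode the audio stream to AAC.
    extract_audio = ["ffmpeg", "-y", "-i", src, "-vn", "-c:a", "aac", audio_out]
    # Step 2: drop the audio (-an), copy the video stream untouched and
    # cut it at the known real duration with -t.
    strip_audio = ["ffmpeg", "-y", "-i", src, "-an", "-c:v", "copy",
                   "-t", f"{duration_s:.3f}", video_out]
    # Step 3: mux both intermediate files together without re-encoding.
    merge = ["ffmpeg", "-y", "-i", video_out, "-i", audio_out,
             "-c", "copy", merged]
    return [extract_audio, strip_audio, merge]
```

Each list can be passed to `subprocess.run(cmd, check=True)` in order; building lists instead of one command string avoids quoting problems with Windows paths.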

    Edit (2018-07-30):

    Since then I have run into several other issues with the same video sources, and in the end I decided to re-encode both the video and audio tracks to get a normalized output. One of the main problems was a difference in duration between the video and audio tracks once I separated them from the original file: sometimes the audio was 7-15 seconds longer than the video, and sometimes it was shorter. Sometimes the video also had extra time of unknown duration appended to it for no apparent reason. To solve this I had to re-encode the audio or the video track, depending on which one needed correction. (Note: I knew the real duration of the video, since I manually requested the exact chunks I needed from the Hikvision NVR through its web interface.) Here is the logic of the C# code I came up with:

    Split the input file (1.mp4 here) into video and audio tracks using ffmpeg:

    ffmpeg -y -i 1.mp4 -vn -c:a libmp3lame -ar 44100 -aq 0 2-a.mp3
    ffmpeg -y -i 1.mp4 -an -c:v copy 2-v.mp4
    

    Note: I encode the audio to MP3 with libmp3lame because Hikvision devices store audio as G.711 PCM (pcm_mulaw in the ffmpeg output above) in their .mp4 files, and that was not suitable for me.

    Use ffprobe to get the durations of the video and audio tracks as ffmpeg identifies them:

    ffprobe -show_entries stream=duration -of compact -v 0 2-a.mp3
    ffprobe -show_entries stream=duration -of compact -v 0 2-v.mp4
    

    Each command prints the track duration. I capture that output and filter it to extract the value. Alternatively, you can note it down manually if you do not plan to automate the whole process.
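For the "capture and filter" part, here is a minimal Python sketch rather than my C# code. It assumes the compact writer prints one line per stream in the form `stream|duration=45.110000`; the function name `parse_stream_durations` is made up for this example.

```python
def parse_stream_durations(ffprobe_output):
    """Extract every duration value (in seconds) from ffprobe compact output.

    Assumes lines shaped like 'stream|duration=45.110000'. Values that are
    not numbers (for example 'N/A') are skipped.
    """
    durations = []
    for line in ffprobe_output.splitlines():
        # Compact output separates the section name and fields with '|'.
        for field in line.strip().split("|"):
            key, sep, value = field.partition("=")
            if sep and key == "duration":
                try:
                    durations.append(float(value))
                except ValueError:
                    pass  # non-numeric duration, ignore it
    return durations
```

The text to parse would come from something like `subprocess.run([...], capture_output=True, text=True).stdout`, using the ffprobe arguments shown above.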

    Compare these durations to the actual duration and act accordingly:

    If the audio duration matches the actual one but the video duration is longer, shrink the video track using ffmpeg and the setpts filter like this:

    ffmpeg -y -i 2-v.mp4 -filter:v setpts=RATIO*PTS 2-v-edit.mp4
    

    Where RATIO is the audio track's duration divided by the video track's duration. For example, if the video duration is 45.11 seconds and the audio duration is 39.76 seconds, then RATIO = 39.76 / 45.11 = 0.8814010197. PTS is the presentation timestamp of each frame, which ffmpeg fills in itself; it is a literal part of the command and not something you need to change.

    If the video duration matches the actual one but the audio is shorter or longer, I re-encode the audio using ffmpeg's atempo filter like this:
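The RATIO arithmetic can be wrapped in a small helper. This is a sketch in Python, not my original C#, and `setpts_command` is a made-up name; the file names are the ones from the command above.

```python
def setpts_command(video_in, video_out, video_duration, target_duration):
    """Build the ffmpeg call that rescales video timestamps to target_duration.

    RATIO = target duration / current video duration, so a ratio below 1
    shortens the video and a ratio above 1 stretches it.
    """
    if video_duration <= 0 or target_duration <= 0:
        raise ValueError("durations must be positive")
    ratio = target_duration / video_duration
    # 'PTS' stays literal: ffmpeg substitutes each frame's timestamp for it.
    return ["ffmpeg", "-y", "-i", video_in,
            "-filter:v", f"setpts={ratio:.10f}*PTS", video_out]
```

With the numbers from the example, `setpts_command("2-v.mp4", "2-v-edit.mp4", 45.11, 39.76)` produces a filter argument of roughly `setpts=0.8814*PTS`.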

    ffmpeg -y -i 2-a.mp3 -acodec libmp3lame -filter:a atempo=RATIO 2-a-edit.mp3
    

    Where RATIO is audio duration / video duration.
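Putting both cases together, the decision can be sketched like this in Python. The `tolerance` parameter and the name `plan_correction` are assumptions of mine for the example: in practice you need some threshold for "matches the actual duration", and 0.5 seconds is just a placeholder. The case where neither track matches is not covered by the logic described here, so the sketch simply raises.

```python
def plan_correction(actual, video, audio, tolerance=0.5):
    """Decide which track to re-encode, given durations in seconds.

    Returns None when both tracks already match the actual duration,
    ("setpts", ratio) when the video must be rescaled, or
    ("atempo", ratio) when the audio must be rescaled.
    In both cases ratio = audio duration / video duration.
    """
    video_ok = abs(video - actual) <= tolerance
    audio_ok = abs(audio - actual) <= tolerance
    if video_ok and audio_ok:
        return None
    ratio = audio / video
    if audio_ok:
        # Audio is right, video is off: setpts=RATIO*PTS brings the
        # video to the audio's length.
        return ("setpts", ratio)
    if video_ok:
        # Video is right, audio is off: atempo=RATIO changes the audio
        # speed so its new length is audio / RATIO = video length.
        return ("atempo", ratio)
    raise ValueError("neither track matches the actual duration")
```

Note that atempo takes a speed factor, which is why the same audio / video ratio shortens audio that is too long (ratio above 1 plays it faster) and lengthens audio that is too short.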

    After this I have normalized video and audio tracks that I can merge using ffmpeg, for example like this (use the unedited file for whichever track did not need correction):

    ffmpeg -i 2-v-edit.mp4 -i 2-a-edit.mp3 -c copy 2.mp4
    

    If given a choice, I would never work with another Hikvision device in my life.