ffmpeg media sha1 checksum mplayer

How can I create a stable checksum of a media file?


How can I create a checksum of only the media data, without the metadata, to get a stable identifier for a media file? Preferably a cross-platform approach using a library that supports many formats, e.g. VLC, ffmpeg or MPlayer.

(The media files would be audio and video in common formats; images would be nice to have too.)


Solution

  • Here is a shell script around mvik's ffmpeg-based answer which prints the MD5 in case of success, or the stderr output in case of failure.

    #!/bin/bash
    
    # Compute the MD5 of the audio stream of an MP3 file, ignoring ID3 tags.
    
    # The problem with comparing MP3 files is that a simple change to the ID3 tags
    # in one file will cause the two files to have differing MD5 sums.  This script
    # avoids that problem by taking the MD5 of only the audio stream, ignoring the
    # tags.
    
    # Note that by virtue of using ffmpeg, this script also works for any other
    # audio file format supported by ffmpeg (not just MP3s).
    
    set -e
    
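    # Create the temp files under $TMPDIR (or /tmp) rather than the current directory.
    stdoutf=$( mktemp "${TMPDIR:-/tmp}/mp3md5.XXXXXX" )
    stderrf=$( mktemp "${TMPDIR:-/tmp}/mp3md5.XXXXXX" )
    # Remove the temp files on any exit path.
    trap 'rm -f "$stdoutf" "$stderrf"' EXIT
    
    set +e
    # -map 0:a selects only the audio streams, so embedded cover art (which ffmpeg
    # exposes as a video stream) does not affect the checksum.
    ffmpeg -nostdin -i "$1" -map 0:a -c:a copy -f md5 - >"$stdoutf" 2>"$stderrf"
    ret=$?
    set -e
    
    if test $ret -ne 0 ; then
        cat "$stderrf" >&2
    else
        sed 's/MD5=//' "$stdoutf"
    fi
    
    exit $ret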
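
Since the question also asks about video and other hash algorithms, here is a rough sketch of the same idea as a reusable function. It assumes an ffmpeg build recent enough to include the `hash` muxer; `media_checksum` and its argument order are hypothetical names chosen for illustration, not part of ffmpeg. As far as I can tell from ffmpeg's documentation, SHA-1 is spelled `SHA160` there, so the sketch defaults to `sha256` instead.

```shell
#!/bin/bash
# Sketch: hash the stream-copied packets of a media file, ignoring container tags.
# media_checksum is a hypothetical helper name, not something ffmpeg provides.
#
# Usage: media_checksum FILE [ALGO] [STREAMS]
#   ALGO     hash name passed to the hash muxer's -hash option (default: sha256)
#   STREAMS  ffmpeg -map specifier (default: 0:a = audio only; 0:v = video only)
media_checksum() {
    local file=$1 algo=${2:-sha256} streams=${3:-0:a} out

    # -c copy avoids decoding, so the hash covers the compressed packets as stored.
    # On failure, ffmpeg's exit status is passed back to the caller.
    out=$(ffmpeg -nostdin -v error -i "$file" -map "$streams" \
                 -c copy -f hash -hash "$algo" -) || return

    # The muxer prints a single "ALGO=hexdigest" line; keep only the digest.
    printf '%s\n' "${out#*=}"
}
```

For example, `media_checksum song.mp3` hashes the audio packets, and `media_checksum clip.mkv md5 0:v` hashes only the video packets with MD5. Because the streams are copied rather than decoded, the checksum survives tag edits and remuxing that leaves the packets untouched, but not re-encoding.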