
shasum --check says ok for multiple algorithms?


I wanted an easy one-line command to check that software I downloaded matches its published checksum. I found this command:

echo "68001338a60fca58e60e3f8dcff122954443afa984a0d766eea9c3b9b9b151d3783e7fd5e3fd8794c5839d7dc8d457e62057b009fc27d76b97d957903ef8641a  clonezilla-live-3.0.0-26-amd64.zip" | shasum --check -a 512

This produces an OK result. But if I change the algorithm to 256, it also says OK. Why is that? I was using the checksums provided by Clonezilla, and the SHA-256 checksum is clearly not the same as the SHA-512 one, yet it still says OK. If I manually alter the checksum in the command, the check fails, so the verification itself seems to be working. Does shasum do some magic behind the scenes to pick the right algorithm even though I specified a different one?


Solution

  • Yes. shasum is actually a Perl script, so you can easily see for yourself. In sub verify, after some setup, it starts processing each line with:

                    if (/^[ \t]*\\?SHA/) {
                            $modesym = '*';
                            ($bslash, $alg, $fname, $sum) =
                            /^[ \t]*(\\?)SHA(\S+) \((.+)\) = ([\da-fA-F]+)/;
                            $alg =~ tr{/}{}d if defined $alg;
                    }
                    else {
                            ($bslash, $sum, $modesym, $fname) =
                            /^[ \t]*(\\?)([\da-fA-F]+)[ \t]([ *^U])(.+)/;
                            $alg = defined $sum ? $len2alg{length($sum)} : undef;
                    }
    

    In case you don't know Perl: variable names usually begin with $, but in certain cases with @ or %, and /.../ contains a regexp. When used in if ( ), the regexp simply returns true if the current data item (here, a line from the checksum file) matches. When used in an assignment like ($a,$b,$c,$d) = /.../, it parses that data item and returns the 'capture groups', marked in the regexp by unbackslashed parentheses, for assignment to the respective variables.
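    For readers more comfortable with Python than Perl, the two-pattern match above can be roughly sketched as follows. This is an illustrative translation, not shasum's code: the names BSD_LINE, GNU_LINE and parse_line are made up here, and the sketch only extracts the fields without computing any hash.

```python
import re

# Rough Python equivalents of the two patterns in shasum's verify sub.
# BSD_LINE, GNU_LINE and parse_line are hypothetical names for this sketch.
BSD_LINE = re.compile(r'^[ \t]*(\\?)SHA(\S+) \((.+)\) = ([\da-fA-F]+)')
GNU_LINE = re.compile(r'^[ \t]*(\\?)([\da-fA-F]+)[ \t]([ *^U])(.+)')


def parse_line(line):
    """Return (format, algorithm_or_None, checksum, filename), or None if no match."""
    if re.match(r'^[ \t]*\\?SHA', line):
        # Line starts with "SHA": algorithm name and filename are spelled out.
        m = BSD_LINE.match(line)
        if m is None:
            return None
        _bslash, alg, fname, digest = m.groups()
        # Same effect as tr{/}{}d: "512/256" becomes "512256".
        return ('bsd', alg.replace('/', ''), digest, fname)
    # Otherwise: checksum first, then a mode character, then the filename.
    m = GNU_LINE.match(line)
    if m is None:
        return None
    _bslash, digest, _modesym, fname = m.groups()
    # No algorithm is named on the line; it has to be inferred separately.
    return ('gnu', None, digest, fname)
```

    A line such as "SHA256 (file.zip) = abc123" takes the first path and carries its own algorithm name, while a line in the style of the question (checksum, two spaces, filename) takes the second path and names no algorithm at all.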

    The first branch handles the format used by BSD cksum/md5/sha1/etc., which states the algorithm name, then the filename in parentheses (in a regexp, backslashed parentheses are data characters), a spaced equals sign, and the hash value. The second branch handles GNU format, which is your case, and it determines the hash algorithm based on the length of the value, using the map* %len2alg, which was defined as:

    my %len2alg = (40 => 1, 56 => 224, 64 => 256, 96 => 384, 128 => 512);
    $len2alg{56} = 512224 if $alg == 512224;
    $len2alg{64} = 512256 if $alg == 512256;
    

    That is, it determines the algorithm from the length of the hash value, so in your case the 128-character value always selects SHA-512 regardless of the -a you passed. The one exception is that the lengths corresponding to SHA-224 and SHA-256 are 'shared' with SHA-512/224 and SHA-512/256: it defaults to the former pair, and for the latter you must use -a, which was processed earlier to set $alg. This is noted on the man page:

     When verifying SHA-512/224 or SHA-512/256 checksums, indicate the
     algorithm explicitly using the -a option, e.g.
    
       shasum -a 512224 -c checksumfile
    

    (although it doesn't say that this is needed only for GNU format, not BSD format). By exceptio probat regulam, this implies that you don't need -a to verify hashes other than those two.
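    The selection rule described above can be sketched in Python as a small lookup. This is only an approximation of the logic shown in the Perl excerpt: LEN2ALG, algorithm_for and dash_a are names invented for this example, and dash_a stands in for whatever value was passed with -a (shasum's documented default is 1, i.e. SHA-1).

```python
# Digest length in hex characters -> SHA variant, mirroring %len2alg.
# LEN2ALG, algorithm_for and dash_a are hypothetical names for this sketch.
LEN2ALG = {40: 1, 56: 224, 64: 256, 96: 384, 128: 512}


def algorithm_for(checksum, dash_a=1):
    """Pick the algorithm for a GNU-format line from the checksum's length.

    dash_a only matters for the two ambiguous lengths (56 and 64 characters),
    and only when it is 512224 or 512256; otherwise it is ignored.
    """
    table = dict(LEN2ALG)
    if dash_a == 512224:
        table[56] = 512224
    if dash_a == 512256:
        table[64] = 512256
    # Unknown lengths yield None, like an undefined $alg in the Perl code.
    return table.get(len(checksum))
```

    For a 128-character checksum like the one in the question, algorithm_for(checksum, dash_a=256) returns 512, which matches the observed behavior: the line is verified as SHA-512 even though -a 256 was given.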

    * Actually, Perl usually calls @x = (1,2,3) an array and %y = (1=>9, 2=>8, 3=>7) a hash, but we're using the other meaning of hash here, and I wanted to avoid adding to the confusion.