linux crc32

Is it possible to make the Linux cksum command zero out?


My goal is to let a file verify its own integrity without sending any additional files or signatures. I'd like to append a CRC to the end of the file so that cksum on the command line produces a predictable output. On the receiving side, I want the check to be a simple one-liner, something like:

cksum somefile | awk '{if ($1 == 0) print "pass"}'

I started from this previous Stack Overflow post: "Verification of a CRC checksum against zero".

I don't understand what I need to add at the end of the file to make it work out to 0, or if it's even possible to do it with cksum. According to this cksum man page, the length of the file is artificially appended to the end of the data.

... last octet, padded with zero bits (if necessary) to achieve an integral number of octets, followed by one or more octets representing the length of the file as a binary value, least significant octet first. The smallest number of octets capable of representing this integer shall be used.
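To make the quoted rule concrete, here is a minimal, unoptimized Python sketch of how the check value could be computed. It is not the cksum source. The function names (`raw_crc`, `length_octets`, `posix_cksum`) are made up for illustration, and the sketch assumes the CRC parameters used by the code in the answer below: polynomial 0x04c11db7, most significant bit first, zero initial value, and a final bit inversion. It also assumes that a zero length contributes no octets at all.

```python
def raw_crc(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32, polynomial 0x04c11db7, MSB first, no final inversion."""
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            crc = ((crc << 1) ^ 0x04C11DB7 if crc & 0x80000000 else crc << 1) & 0xFFFFFFFF
    return crc


def length_octets(length: int) -> bytes:
    """Length as the fewest octets that can hold it, least significant first.

    Assumption: a length of zero yields no octets.
    """
    out = bytearray()
    while length:
        out.append(length & 0xFF)
        length >>= 8
    return bytes(out)


def posix_cksum(data: bytes) -> int:
    """Check value: CRC over the data, then over the length octets, inverted."""
    crc = raw_crc(data)
    crc = raw_crc(length_octets(len(data)), crc)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    import sys
    payload = sys.stdin.buffer.read()
    # Same two columns that cksum prints for data read from stdin.
    print(posix_cksum(payload), len(payload))
```

The point of the sketch is that the length octets are fed through the same CRC after the data, so any bytes appended to the file have to account for both their own effect on the length and the length octets that follow them.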

Is there a way to manipulate the input data to make the cksum work out to a 0 value? Can it be done with one of the sha*sum or md5sum apps? I need it to be done with something that is pre-installed on standard Ubuntu 22. Unfortunately, crc32 isn't one of those.


Solution

  • Yes, it can be done. This C code will append four bytes to the input data, so that the POSIX cksum algorithm will give a zero check value on the output.

    // Force the result of the POSIX cksum command to be zero by appending four
    // bytes to the data.
    //
    // Placed into the public domain by Mark Adler, 17 Oct 2024.
    
    #include <stdio.h>
    #include <stdint.h>
    
    // Generate forward and reverse byte-wise CRC calculation tables for the CRC-32
    // used by the POSIX cksum command.
    static uint32_t forward[256], reverse[256];
    static void make_tables(void) {
        for (uint32_t n = 0; n < 256; n++) {
            uint32_t crc = n << 24;
            for (int k = 0; k < 8; k++)
                crc = crc & 0x80000000 ? (crc << 1) ^ 0x04c11db7 : crc << 1;
            forward[n] = crc;
            reverse[crc & 0xff] = (crc >> 8) ^ (n << 24);
        }
    }
    
    // Read the data from in. Write that data and four more bytes to out such that
    // the POSIX cksum command will give a zero check value for the output. Call
    // make_tables() once before using cksum_zero().
    static void cksum_zero(FILE *in, FILE *out) {
        // Compute the raw CRC of the data from in, crc. Write the data to out
        // and compute the number of input bytes, len.
        uint32_t crc = 0;
        uint64_t len = 0;
        int ch;
        while ((ch = getc(in)) != EOF) {
            crc = forward[(crc >> 24) ^ ch] ^ (crc << 8);
            putc(ch, out);
            len++;
        }
    
        // Add four to the length of the data for the four bytes that we will be
        // inserting after the data and before what cksum appends.
        len += 4;
    
        // Compute k, the number of bits of len that will be appended by cksum.
        uint64_t n = len;
        int k = 0;
        do {
            n >>= 8;
            k += 8;
        } while (n);
    
        // Compute a reverse CRC of the appended length preceded by four zero
        // bytes. The zeros are where the bytes we are calculating will go. We start
        // with the desired final raw CRC, 0xffffffff. cksum inverts the CRC at the
        // end, so this will result in cksum giving a check value of zero.
        uint32_t tail = 0xffffffff;
        do {
            k -= 8;
            tail = reverse[tail & 0xff] ^ (tail >> 8) ^ ((len >> k) << 24);
        } while (k);
        tail = reverse[tail & 0xff] ^ (tail >> 8);
        tail = reverse[tail & 0xff] ^ (tail >> 8);
        tail = reverse[tail & 0xff] ^ (tail >> 8);
        tail = reverse[tail & 0xff] ^ (tail >> 8);
    
        // Append the calculated four bytes to the data.
        crc ^= tail;
        putc(crc >> 24, out);
        putc(crc >> 16, out);
        putc(crc >> 8, out);
        putc(crc, out);
    }
    
    int main(void) {
        // Generate the CRC tables.
        make_tables();
    
        // Get the input data from stdin and write that data and the four appended
        // bytes to stdout.
        cksum_zero(stdin, stdout);
        return 0;
    }
    

    Python is installed on Ubuntu by default. This Python code computes the four bytes that need to be appended using the output of cksum:

    # Read the output of POSIX cksum from stdin and write four bytes to stdout to
    # be appended to the input of cksum. The result will have a zero cksum value.
    # This will then always give zero as the first number in the output:
    #     cksum inputfile | python3 zerock.py | cat inputfile - | cksum
    
    # crc is the final raw CRC. Return what the raw CRC was before the low non-zero
    # bytes of len were applied, starting with the least significant byte. If len
    # is zero, then there are no non-zero bytes and the result is crc. The constant
    # below is the 32-bit CRC polynomial 0x04c11db7 rotated right one bit.
    def back(crc, len):
        bits = 0
        while ((len >> bits) != 0):
            bits += 8
        for i in range(bits - 8, -8, -8):
            for _ in range(8):
                crc = (crc >> 1) ^ 0x82608edb if (crc & 1) != 0 else crc >> 1
            crc ^= ((len >> i) & 0xff) << 24
        return crc
    
    import sys
    crc, len = map(int, sys.stdin.read().split()[:2])   # get cksum output
    crc = back(crc ^ 0xffffffff, len)           # undo original length to get CRC
    tail = back(0xffffffff, (len + 4) << 32)    # undo new length + four zero bytes
    sys.stdout.buffer.write((crc ^ tail).to_bytes(4, 'big'))
    

    The POSIX cksum algorithm is defined here.
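If you want to convince yourself that the four appended bytes really do force a zero check value without running cksum, the following sketch does the whole round trip in Python. It is an illustration under stated assumptions, not part of the code above: the helper names (`step`, `unstep`, `length_octets`, `cksum`, `zero_tail`) are hypothetical, the CRC is computed bit by bit rather than with tables, and a zero length is assumed to contribute no length octets.

```python
POLY = 0x04C11DB7          # CRC-32 polynomial used by POSIX cksum
POLY_BACK = 0x82608EDB     # POLY rotated right one bit, for the reverse step


def step(crc, byte):
    """Feed one byte into the raw CRC (MSB first)."""
    crc ^= byte << 24
    for _ in range(8):
        crc = ((crc << 1) ^ POLY if crc & 0x80000000 else crc << 1) & 0xFFFFFFFF
    return crc


def unstep(crc, byte):
    """Inverse of step(): the raw CRC as it was before `byte` was fed in."""
    for _ in range(8):
        crc = (crc >> 1) ^ POLY_BACK if crc & 1 else crc >> 1
    return crc ^ (byte << 24)


def length_octets(n):
    """Length octets as appended by cksum, least significant first."""
    out = []
    while n:
        out.append(n & 0xFF)
        n >>= 8
    return out


def cksum(data):
    """Check value over the data followed by its length octets, inverted."""
    crc = 0
    for b in data:
        crc = step(crc, b)
    for b in length_octets(len(data)):
        crc = step(crc, b)
    return crc ^ 0xFFFFFFFF


def zero_tail(data):
    """Four bytes that make cksum(data + tail) equal zero."""
    crc = 0
    for b in data:
        crc = step(crc, b)
    # Start from the raw CRC that inverts to zero and walk backwards: first
    # over the octets of the new length (last one fed is undone first), then
    # over four zero bytes standing in for the tail itself.
    want = 0xFFFFFFFF
    for b in reversed(length_octets(len(data) + 4)):
        want = unstep(want, b)
    for _ in range(4):
        want = unstep(want, 0)
    return (crc ^ want).to_bytes(4, "big")


payload = b"some example payload\n"
print(cksum(payload + zero_tail(payload)))  # prints 0
```

The final XOR works because feeding four bytes into a 32-bit CRC state is equivalent to XORing them into the state and then feeding four zero bytes.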