algorithmcheck-digit

Extending the Damm algorithm to base-32


I'd like to use the Damm algorithm to generate check digits for codes with a 32-character alphabet. The algorithm itself is easily applied to any base (except 2 or 6). The difficulty is the necessary look-up table, which must be a totally anti-symmetric quasigroup with a single character (usually 0) down the main diagonal.

The Wikipedia page gives a table for base 10, and the Python implementation gives a table for base-16, but I haven't found a base-32 example. Does anyone have a suitable table for base-32?


Solution

  • Inspired by David Eisenstat's answer (and the original dissertation by Damm that he cites), here's a simple Python routine to calculate / verify a Damm checksum for any base 2n for 2 ≤ n ≤ 32:

    # reduction bitmasks for GF(2^n), 2 <= n <= 32
    masks = (3, 3, 3, 5, 3, 3, 27, 3, 9, 5, 9, 27, 33, 3, 43, 9,
             9, 39, 9, 5, 3, 33, 27, 9, 27, 39, 3, 5, 3, 9, 141)
    
    # calculate Damm checksum for base 2^n
    def damm2n (digits, n):
        modulus = (1 << n)
        mask = modulus | masks[n - 2]
        checksum = 0
        for digit in digits:
            checksum ^= digit
            checksum <<= 1
            if checksum >= modulus: checksum ^= mask
        return checksum
    

    The routine takes a list (or, more generally, an iterable) of digits, which are assumed to be integers in the range 0 to 2n−1 inclusive, and the number n of bits per digit (assumed to be in the range 2 to 32 inclusive).

    The totally asymmetric quasigroup used by this implementation of the Damm algorithm is given by the map (a, b) ↦ 2 ⊗ (ab), where ⊕ denotes addition in the finite field GF(2n) (which is simply bitwise XOR), ⊗ denotes multiplication in GF(2n), and 2 denotes the element represented by the bitstring 0...0102 in the usual n-bit representation of GF(2n).

    This is equivalent to the map (a, b) ↦ (2a) ⊕ b given by Damm in Example 5.2 of his thesis, except that the input digits are b permuted (by multiplying them with 2 in GF(2n)) to ensure that (a, a) ↦ 0 for all a. This is equivalent to permuting the columns of the quasigroup operation table so that the diagonal is all zeros, and allows the checksum to be verified simply by appending it to the original input and checking that the new checksum of the extended input is zero.

    The GF(2n) multiplication by 2 is implemented using the usual trick of shifting left by one and, if the n-th bit of the result is set, XORing it with a bitmask corresponding to a monic irreducible polynomial of order n. The specific bitmasks used are taken from the Table of Low-Weight Binary Irreducible Polynomials by Gadiel Seroussi (1998). If you (for some reason) need checksums for bases larger than 232, their table goes up to a whopping 210,000. The Seroussi table lists the exponents of the non-zero coefficients of each reduction polynomial, excluding the constant term; the bitmasks in the code above are obtained by discarding the highest exponent (which is always n), summing together 2k for the other exponents k and adding 1. (Thus, for example, the entry "8,4,3,1" for n = 8 yields the mask 24 + 23 + 21 + 1 = 16 + 8 + 2 + 1 = 27.)

    In particular, the code above yields, for n = 4, results matching the base-16 Damm checksum implementation by Johannes Spielmann. This is not guaranteed in general, since there are many possible ways to implement a Damm checksum for a given base, but in this case, the quasigroups used by the two implementations happen to match.


    Ps. Here's some Python code to print out the look-up table in a format similar to what the Wikipedia article uses. (Thanks to CJM for the initial version.)

    alphabet = '0123456789ABCDEFGHJKLMNPQRTUVWXY' # avoids easy-to-confuse characters
    bits = 5
    
    # find out which first single-digit input gives which checksum
    r = [-1] * 2**bits
    for i in range(2**bits): r[damm2n([i], bits)] = i
    
    # print header
    print '  |',  ' '.join(alphabet)
    print '--+' + '--' * len(alphabet)
    
    # print rest of table    
    for i in range(2**bits):
        row = (alphabet[damm2n([r[i], j], bits)] for j in range(2**bits))
        print alphabet[i], '|', ' '.join(row)