Suppose I have an arbitrary block of bytes. The block is terminated with a CRC remainder computed over the whole block using the CRC-16-CCITT algorithm, where the remainder is arranged in the big-endian byte order. After the block and the remainder, there is an arbitrary number of zero bytes that continue until the end of the byte stream.
This arrangement takes advantage of a certain property of this CRC algorithm which is normally considered undesirable: it does not distinguish between messages with different numbers of trailing zeroes, provided that the message is terminated with its remainder (it is in my case). This allows the receiver to assert the correctness of the data regardless of the number of trailing bytes in the stream.
Here is an example:
>>> hex(crc(b'123456789')) # Computing the remainder
'0x29b1'
>>> hex(crc(b'123456789\x29\xb1')) # Appending the remainder in the big-endian order
'0x0' # If the remainder is correct, the residual value is always zero
>>> hex(crc(b'123456789\x29\xb1\x00\x00')) # ...and it is invariant to the number of trailing zeros
'0x0'
>>> hex(crc(b'123456789\x29\xb1\x00\x00\x00'))
'0x0'
This is the desired behavior in my case. However, in my application the data are exchanged over a medium that utilizes a non-return-to-zero (NRZ) encoding: the medium layer injects a single stuff bit after every five consecutive data bits of the same level, where the polarity of the stuff bit is the opposite of the preceding bits; e.g. the value of 00000000
is transmitted as 000001000
. Bit stuffing is highly undesirable because it adds overhead.
In order to take advantage of the invariance of the CRC algorithm to the trailing data (which is used for padding) and yet avoid bit stuffing, I intend to xor every data byte with 0x55 (although it could be any other bit pattern that avoids stuffing) before updating the CRC remainder, and then xor the final remainder with 0x5555.
For reference, here is the standard CRC-16-CCITT algorithm, naive implementation:
def crc16(b):
crc = 0xFFFF
for byte in b:
crc ^= byte << 8
for bit in range(8):
if crc & 0x8000:
crc = ((crc << 1) ^ 0x1021) & 0xFFFF
else:
crc = (crc << 1) & 0xFFFF
return crc
And here is my modification which xors inputs and outputs with 0x55:
def crc16_mod(b):
crc = 0xFFFF
for byte in b:
crc ^= (byte ^ 0x55) << 8
for bit in range(8):
if crc & 0x8000:
crc = ((crc << 1) ^ 0x1021) & 0xFFFF
else:
crc = (crc << 1) & 0xFFFF
return crc ^ 0x5555
A simple check confirms that the modified algorithm behaves as intended:
>>> print(hex(crc16_mod(b'123456789'))) # New remainder
0x954f
>>> print(hex(crc16_mod(b'123456789\x95\x4f'))) # Appending the remainder; residual is 0x5555
0x5555
>>> print(hex(crc16_mod(b'123456789\x95\x4f\x55\x55\x55'))) # Invariant to the number of trailing 0x55
0x5555
>>> print(hex(crc16_mod(b'123456789\x95\x4f\x55\x55\x55\x55'))) # Invariant to the number of trailing 0x55
0x5555
My question is as follows: am I compromising error-detecting properties of the algorithm by introducing this modification? Are there any other downsides I should be aware of?
Under the standard model of errors (bits flipped independently with a fixed probability), there is no downside. It is difficult to anticipate practical difficulties.