checksum crc crc32 crc64

Whole-file CRC computation: undetected error probability


I have read in some papers that the probability of an error going undetected by a CRC does not depend on the message size; it depends only on the number of CRC bits (2^(-32) for a 32-bit CRC).

My questions are:

  1. Why do we need wider CRCs? Even if we use only a 16-bit CRC over the entire file, the probability of an undetected error is almost zero, so we should be able to detect essentially all errors in the file.
  2. Why, when using a 32-bit CRC, does the file size need to be smaller than 2^32 bits (512 MB)? Does it mean that if a burst error changes more than 512 MB of the file, the CRC can't detect it?

Solution

    1. The probability of a random error pattern going undetected by a 16 bit CRC is about 2^(-16) (1/65536). A 32 bit CRC reduces this to 2^(-32) (1 / 4 billion). A quick simulation of this appears in the sketch after the link below.

    2. All CRCs will detect a single bit error, no matter how large the file is. If the goal is a 32 bit CRC that is guaranteed to detect any 2 bit error pattern, the maximum size of file + CRC is 2^32-1 bits. If the size including the CRC is >= 2^32 bits, then a 2 bit error at bit[i+0] and bit[i+2^32-1] will not be detected (the sketch after the link below shows the same effect with a tiny 4-bit CRC). If the goal is a CRC that detects all 3 bit errors, this is normally done by having at least 2 prime polynomial factors in the CRC: one of them is (x+1), which detects any odd number of bit errors, and the other is a 31 bit factor, which detects any 2 bit error as long as file size + CRC is <= 2^31-1 bits. As the number of errors that a CRC is guaranteed to detect increases, the maximum size of file + CRC decreases. Take a look at the table in the "CRC Zoo". It lists CRC polynomials, followed by the maximum number of data bits (not including the CRC) for detecting all 2, 3, 4, 5, ... bit errors (Hamming distance 3, 4, 5, 6, ...).

    https://users.ece.cmu.edu/~koopman/crc/crc32.html
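
Not part of the original answer, but both points are easy to see at a small scale. The sketch below uses a toy 4-bit CRC built from the primitive polynomial x^4 + x + 1; the polynomial, the `crc_bits` helper, and all sizes are illustrative assumptions, not any standard CRC. It shows that random corruption slips past a w-bit CRC about 2^-w of the time regardless of message length, that every single-bit error is caught, and that two bit errors spaced exactly one polynomial period apart are not caught, which is the same failure described above at a spacing of 2^32-1 bits for a 32-bit CRC.

```python
import random

# Toy 4-bit CRC purely for illustration: x^4 + x + 1 is a primitive degree-4
# polynomial, so its period is 2^4 - 1 = 15 (the small-scale analogue of
# 2^32 - 1 for a 32-bit CRC). Polynomial, sizes, and names are assumptions.
POLY = [1, 0, 0, 1, 1]    # coefficients of x^4 + x + 1, highest degree first
N = 4                     # CRC width in bits

def crc_bits(bits, poly=POLY, n=N):
    """n-bit CRC of a list of message bits: the remainder of m(x)*x^n divided
    by the generator polynomial (no initial value, no final xor)."""
    reg = list(bits) + [0] * n            # append n zero bits
    for i in range(len(bits)):
        if reg[i]:                        # cancel the current leading term
            for j, p in enumerate(poly):
                reg[i + j] ^= p
    return tuple(reg[-n:])                # the remainder is the CRC

random.seed(1)
msg = [random.randint(0, 1) for _ in range(64)]
good = crc_bits(msg)

# (1) Random corruption goes undetected about 2^-4 = 1/16 of the time for this
#     4-bit CRC, independent of message length; a 32-bit CRC gives ~2^-32.
trials = 20000
hits = sum(crc_bits([random.randint(0, 1) for _ in msg]) == good
           for _ in range(trials))
print(hits / trials)                      # prints roughly 0.0625

# (2) Every single-bit error is detected, no matter where it occurs.
assert all(crc_bits(msg[:i] + [msg[i] ^ 1] + msg[i+1:]) != good
           for i in range(len(msg)))

# (3) Two bit errors exactly one period (15 bits) apart cancel out and are NOT
#     detected -- the same failure a 32-bit CRC has at a spacing of 2^32 - 1.
bad = msg[:]
bad[10] ^= 1
bad[10 + 15] ^= 1
assert crc_bits(bad) == good
```

Under these assumptions the printed ratio comes out near 0.0625 and both assertions pass.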

    Although not asked, another issue is the probability of transmitted or written data having no errors at all. This depends on the error rate and the size of the data. If the error rate for a byte is e, then the probability of zero errors is the probability that none of the n bytes is in error, (1 - e)^n, where n is the number of bytes. In situations where the probability of errors is significant, some type of error correction code is used to reduce the probability of an uncorrectable error.
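
To put rough numbers on this (the per-byte error rate below is an assumed figure, chosen only to make the arithmetic visible):

```python
e = 1e-12    # assumed probability that any one byte is corrupted (illustrative)
for n in (10**6, 10**9, 10**12):
    print(n, "bytes:", (1 - e) ** n)
# 1000000 bytes:       ~0.999999  (a 1 MB file almost certainly arrives clean)
# 1000000000 bytes:    ~0.999
# 1000000000000 bytes: ~0.37      (a 1 TB file has only ~37% chance of zero errors)
```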