I am setting up a Reed Solomon library to correct AND detect errors as the come in. For simplicity, let's look at a Reed Solomon configuration where
m(symbol size) : 8 [GF(256)]
k(user payload) : 2
2T(parity symbols): 2
Yielding a transmitted payload of 4 octets.
This can correct any 1 symbol error and, the purpose of this post, it can DETECT 2 symbol errors. There is limited literature on the error detection of RS, however just for a cursory source you can check the introduction in the Wikipedia article:
By adding t check symbols to the data, a Reed–Solomon code can detect any combination of up to and including t erroneous symbols, or correct up to and including ⌊t/2⌋ symbols.
However this doesn't seem to match my observations.
I have a library, primarily built from this article.
which, from what I can tell, works very well. I set up a exhaustive test surrounding our implementation and found a case where there are 2 symbol errors (which from what I understood should be DETECTABLE) however it is not. From what I understand, a simple check to see if an un-correctable error occurred which passes the normal checks (i.e. the error locator is valid, find errors has a valid error count, polynomial degree is valid) is to recalculate the syndromes for the corrected message. If the syndromes are non zero then we still have an error. However when I do this the syndromes are all 0, suggesting no error is found and that we have a collision between an error vector with 1 erroneous symbol, and an error vector with 2 erroneous symbols.
This is the test:
# Create Message
msg_in = [0x10,0x2f]
# Append RS FEC codes
msg_enc = rs_encode_msg(msg_in)
# Apply Error
errVec = [0,0x2b,0,0xea]
for i,err in enumerate(errVec):
msg[i] ^= err;
# Decode
# Syndromes
synd = rs_calc_syndromes(msg)
# Error Locator
err_loc = rs_find_error_locator(synd)
# Error Positions
pos = rs_find_errors(err_loc)
# Correct
msg = rs_correct_errata(msg, synd, pos, err_loc)
#Calculate syndromes again
newSynd = rs_calc_syndromes(msg)
Output:
Message
0x10 0x2f
Encoded Message
0x10 0x2f 0x1 0x3e
Encoded Message With Errors
0x10 0x4 0x1 0xd4
Syndromes
0xc1 0x46
Error Locator
0x8 0x1
Error Position
0x00 # The first position
Corrected Message
0xd1 0x4 0x1 0xd4
Recalculated Syndromes
0x0 0x0
If you are still reading, thank you. I know I have no provided the entire library, but I did provide the inputs, and outputs, and key variable values. What I want to know is if my understanding written above is wrong; that we can DETECT 2T symbol errors where 2T is the amount of added symbols. Because from this test case it seems there is a collision, I test this further by just calculating the syndromes across the following error vectors which further supports a collision and that Reed Solomon CANNOT NECESSARILY DETECT ALL errors up to 2T. Let me know if I am wrong.
error vector: 0xc1 0x0 0x0 0x0
yielding syndrome: 0xc1 0x46
and
error vector: 0x0 0x2b 0x0 0xea
yielding syndrome: 0xc1 0x46
Has a collision
With two parity symbols, the syndromes will only be unique for single symbol errors, which is why they can be used to correct single symbol errors. In the case of two symbol errors, the syndromes will be non-zero (one of them may be zero, but not both), but not always unique, for all combinations of two error locations and two error values (which is why two symbol errors can't be corrected if there are only two parity symbols).
With two parity symbols, the Hamming distance is 3 symbols. Every valid (zero syndromes == exact multiple of the generator polynomial) codeword differs by at least 3 symbols from every other valid code word, so no 2 symbol error case will appear to be a valid (zero syndrome) codeword.
It is possible for a 3 or more error case combination of error locations and values to produce syndromes == 0. The simplest example of this is to take a valid codeword (zero error message), and xor the 3 symbol generator polynomial anywhere within the message, which will be another valid codeword (an exact multiple of the generator polynomial).
In addition, there is a maximum length codeword. For BCH type Reed Solomon code, which is what you are using, for GF(2^n), it's (2^n)-1 symbols. If a message contains 2^n or more symbols (including the parity symbols), then a two error case with the same error value at message[i] and message[i + 2^n - 1] will produce syndromes of zero. For the original view type Reed Solomon code, the maximum length codeword is 2^n (one more symbol than BCH type), but it's rarely used since decoding operates on the entire message, while BCH decoding operates on syndromes.
Update - I forgot to mention that with two parity symbols, attempting to perform a one error correction on a two error message may end up causing a third error, which will result in a valid codeword (syndromes will be zero), but one that differs from the original codeword in three locations.
The probability of this happening is reduced if the codeword is shortened, as any calculated location that is not within range of the shortened codeword would be a detected error. If there are n symbols (including the parity symbols), then the probability of of a single error mis-correction based on the calculated location being within range is about n/255.
In this case, with a codeword size of 4 bytes, I found 3060 out of a possible 390150 two error cases that would create a valid code word by doing a single error correction that ends up creating a third error.