qr-code numeric decoding bitstring

How do I decode a bit string representing an n-digit number whose digits were grouped and encoded as bit words of different lengths?


Background

I am trying to decode a bitstring from a QR-Code, where the data was encoded in numeric mode. According to this QR-Code tutorial: https://www.thonky.com/qr-code-tutorial/numeric-mode-encoding (which references the standard), the numeric encoding shall be as follows:

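Split the n-digit number into 3-digit groups and encode each group into a

  • 10-bit word if the group has no leading zero,
  • 7-bit word if the group has one leading zero,
  • 4-bit word if the group has two leading zeros.

If n is not a multiple of three, the last group will be 1 or 2 digits long. The rules above also apply to this last group.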

Encoding Example:

Take the 14-digit number: 12300101234567

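Split it into 3-digit groups and convert them to binary numbers:

  • 123 → 0001111011 (10 bits)
  • 001 → 0001 (4 bits)
  • 012 → 0001100 (7 bits)
  • 345 → 0101011001 (10 bits)
  • 67 → 1000011 (7 bits)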

Therefore the 14-digit number is encoded into the following 38 bits: 00011110110001000110001010110011000011
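The encoding described above can be sketched in Python. This is only an illustration of the rule as the tutorial states it: the function name `encode_numeric_tutorial` is made up, and choosing the word width from the group's numeric value is an assumed reading of the leading-zero rule.

```python
def encode_numeric_tutorial(digits: str) -> str:
    """Encode a digit string with the tutorial's variable-length rule (hypothetical helper)."""
    words = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        value = int(group)  # int() discards any leading zeros
        # Assumption: the width depends only on how many digits remain
        # after the leading zeros are dropped.
        if value >= 100:
            width = 10
        elif value >= 10:
            width = 7
        else:
            width = 4
        words.append(format(value, f"0{width}b"))
    return "".join(words)


bits = encode_numeric_tutorial("12300101234567")
print(bits, len(bits))
```

Running it on the example number reproduces the 38-bit string shown above.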

Decoding (what I have so far):

The QR-Code gives me the number of digits that were encoded, so taking the example above I know n = 14.

Following the encoding rules, calculate 14 mod 3 = 2; thus there are four 3-digit groups followed by one 2-digit group.

Therefore the last bit word is 7 bits long: 1000011, which encodes the number 67.

The remaining 31 bits encode the other 12 digits. 0001111011000100011000101011001

How do 4-, 7- and 10-bit words fit into 31 bits?

Try combinations:

Therefore the bitstring is built from the following bit words.

Problem:

I don't know the order of the bit words.

There are two more restrictions that result from the encoding rules:

These restrictions help exclude several orders when iterating over all possible orders, but they do not exclude all of them, so I am not able to determine the correct order of the bit words.
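The search over orders can be sketched as a brute-force loop. This is only a sketch under stated assumptions: the helper `candidate_parses` and the table `VALID_RANGE` are hypothetical, and the restrictions are assumed to be value ranges per word length (100 to 999 for 10 bits, 10 to 99 for 7 bits, 0 to 9 for 4 bits).

```python
from itertools import product

# Assumed value range for each word length under the tutorial's rule.
VALID_RANGE = {10: range(100, 1000), 7: range(10, 100), 4: range(0, 10)}


def candidate_parses(bits: str, n_words: int):
    """Return every (lengths, digits) split of `bits` that passes the range checks."""
    found = []
    for lengths in product((4, 7, 10), repeat=n_words):
        if sum(lengths) != len(bits):
            continue  # this order of word lengths does not cover the bit string
        pos, groups = 0, []
        for width in lengths:
            value = int(bits[pos:pos + width], 2)
            if value not in VALID_RANGE[width]:
                break  # value is impossible for this word length
            groups.append(f"{value:03d}")  # restore the dropped leading zeros
            pos += width
        else:
            found.append((lengths, "".join(groups)))
    return found


candidates = candidate_parses("0001111011000100011000101011001", 4)
for lengths, digits in candidates:
    print(lengths, digits)
```

On the 31 bits from the example, more than one order survives the checks. For instance, both (10, 4, 7, 10) giving 123001012345 and (4, 10, 7, 10) giving 001945012345 are accepted, so the bit string alone cannot identify the original digits under these rules.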

I appreciate your help, thanks.


Solution

  • I located a copy of the actual QR code standard, ISO/IEC 18004:2015, and found that the part about leading zeros is not in it.

    A 3-digit group is encoded in 10 bits, regardless of how many leading zeros it has. Sources claiming otherwise are wrong.

    The standard even uses an example with a leading zero: 01234567 is broken up into 012 345 67, and the 012 is encoded as 0000001100, not as 0001100.
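With fixed widths, decoding needs no search. A minimal sketch, assuming the digit count is already known from the QR code's character count field and that a final 2-digit group uses 7 bits and a final 1-digit group uses 4 bits; the function name `decode_numeric` is made up for illustration.

```python
def decode_numeric(bits: str, n_digits: int) -> str:
    """Decode a numeric-mode bit string with fixed word widths (hypothetical helper)."""
    widths = {3: 10, 2: 7, 1: 4}  # digits in group -> bits in word
    digits = []
    pos = 0
    remaining = n_digits
    while remaining > 0:
        group_len = min(3, remaining)
        width = widths[group_len]
        value = int(bits[pos:pos + width], 2)
        # Pad with zeros so a group such as 012 keeps its leading zero.
        digits.append(str(value).zfill(group_len))
        pos += width
        remaining -= group_len
    return "".join(digits)


# The example 01234567, encoded as 012 | 345 | 67.
example_bits = "0000001100" + "0101011001" + "1000011"
print(decode_numeric(example_bits, 8))  # prints 01234567
```

Because every word width follows from the digit count alone, the order of the bit words is never ambiguous.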