
Identifying the CRC8 Checksum

I have been trying to reverse-engineer an old device that uses an 8-bit uC. I have access to a couple of messages that the uC sends to the computer, along with their respective checksums. I also found the code of the uC, but it is an incomplete version, so it doesn't provide more info than this:

Here are some messages. The structure is: STX + DATA + EOT + checksum + EXT

Here is a common implementation of the CRC algorithm:

byte CRC8(byte param_1,byte *checksum)
{
  byte bVar1;
  char cVar2= '\b';              /* bit counter, '\b' == 8 */
  byte bStack_3 = param_1;

  do {
    bVar1 = (bStack_3 ^ *checksum) << 7;
    if ((char)bVar1 < '\0') {
      *checksum = *checksum ^ 0x18;
    }
    *checksum = *checksum >> 1 | bVar1;
    bStack_3 = bStack_3 >> 1 | bStack_3 << 7;
    cVar2 = cVar2 + -1;
  } while (cVar2 != '\0');
  return param_1;
}

  • I had two approaches to find the CRC algorithm.

    1. Reverse-Engineering the Binary

    I found that several interrupt entries are set to jumps. This is quite common. Among them is the interrupt service routine for the serial interface. Looking deeper I found they are using circular buffers for reception and transmission.

    Visiting the usage locations of the variables I found the base sending routine. Following the chain of references I finally found the interesting function I called sendAndUpdateCrc(). This is a cleaned version:

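    /* body inferred from the disassembly of the "poor code" example further down */
    void sendAndUpdateCrc(char c) { fputc(c); updateCrc(c); }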

    This core function is called all over the program; I will get back to it later. The most important part is the function I called updateCrc().

    This is the cleaned version of the CRC update function. It uses a global variable to accumulate.

    void updateCrc(char c) {
      char cVar2 = 8 /* '\b' */;
      byte bStack_3 = c;
      do {
        byte bVar1 = (bStack_3 ^ crc) << 7;
        if ((char)bVar1 < 0 /*'\0' */) { /* checks the sign bit that was LSBit */
          crc = crc ^ 0xXX;
        }
        crc = crc >> 1 | bVar1;
        bStack_3 = bStack_3 >> 1 | bStack_3 << 7;
        cVar2 = cVar2 + -1;
      } while (cVar2 != 0 /* '\0' */);
    }

    With this function the second approach was much easier, see below.
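
    To check that the rotate-based decompiler output and a plain shift-based loop really compute the same thing, a small host-side comparison helps. This is only a sketch: the constant 0x18 is an assumption taken from the CRC8() listing in the question, and the names updateDecompiled, updateSimple and countMismatches are made up for illustration.

```c
#include <stdint.h>

#define POLY 0x18u  /* assumption: constant from the CRC8() listing in the question */

/* Decompiler-style update: rotates the data byte and ORs the feedback bit into bit 7. */
uint8_t updateDecompiled(uint8_t crc, uint8_t c) {
  for (int i = 0; i < 8; ++i) {
    uint8_t fb = (uint8_t)((c ^ crc) << 7);   /* LSB of (c ^ crc) moved to bit 7 */
    if (fb & 0x80u) {
      crc ^= POLY;
    }
    crc = (uint8_t)((crc >> 1) | fb);
    c = (uint8_t)((c >> 1) | (c << 7));       /* rotate right by one bit */
  }
  return crc;
}

/* Simplified update: plain shifts and an explicit feedback bit. */
uint8_t updateSimple(uint8_t crc, uint8_t c) {
  for (int i = 0; i < 8; ++i) {
    if ((c ^ crc) & 1u) {
      crc = (uint8_t)(((crc ^ POLY) >> 1) | 0x80u);
    } else {
      crc >>= 1;
    }
    c >>= 1;
  }
  return crc;
}

/* Counts the (crc, byte) pairs on which the two variants disagree. */
int countMismatches(void) {
  int n = 0;
  for (int crc = 0; crc < 256; ++crc) {
    for (int c = 0; c < 256; ++c) {
      if (updateDecompiled((uint8_t)crc, (uint8_t)c) !=
          updateSimple((uint8_t)crc, (uint8_t)c)) {
        ++n;
      }
    }
  }
  return n;
}
```

    Calling countMismatches() should return 0, because the rotation only matters for bits that never reach the LSB within eight steps. With 0x18 as the constant, a single update maps (crc = 0, byte = 0x01) to 0x5E.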

    Visiting the calling locations of sendAndUpdateCrc() I came across these specialized functions.

    The first two characters of the messages you observed are fetched from addresses in internal RAM. Their values are calculated in another function, but I refrained from wading through its code to try to understand it. It seems to be called during startup of the application. Perhaps these two characters are set by the PC.

    During reverse engineering I found two different styles of code. One is quite optimal, while the other is really poor and therefore slow. No human would write such code, so I assume it was produced by a very simple compiler. The good code was most probably written in assembly by a human.

    Example of good code:

            PUSH    BANK0_R0
    LAB_CODE_00e0:                              ; wait for space in the buffer
            CLR     EA
            MOV     R0,txdBufferSize
            CJNE    R0,#0x2,LAB_CODE_00eb
            SETB    EA
            SJMP    LAB_CODE_00e0
    LAB_CODE_00eb:
            MOV     R0,txdBufferWritePointer    ; store character in buffer
            MOV     @R0,A
            INC     R0
            CJNE    R0,#txdBufferEnd,LAB_CODE_00f4
            MOV     R0,#txdBufferBegin
    LAB_CODE_00f4:
            MOV     txdBufferWritePointer,R0    ; and advance buffer pointer
            INC     txdBufferSize               ; increment used size
            MOV     R0,txdBufferEmpty           ; check empty flag
            CJNE    R0,#0x1,LAB_CODE_0102
            MOV     txdBufferEmpty,#0x0
            SETB    TI                          ; start transmission after an idle phase
    LAB_CODE_0102:
            SETB    EA
            POP     BANK0_R0
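
    For readers less used to 8051 assembly, the routine above corresponds roughly to the following C. It is a host-side sketch, not the firmware's source: EA and TI are modelled as plain variables, the write pointer becomes an index, the capacity of 2 is an assumption read off the `#0x2` comparison, and sendTry is a hypothetical helper that performs one pass of the wait loop.

```c
#include <stdbool.h>
#include <stdint.h>

#define TXD_CAPACITY 2   /* assumption: taken from the CJNE R0,#0x2 comparison */

/* Plain variables stand in for the 8051 bits EA and TI (host-side model only). */
bool EA = true;
bool TI = false;

uint8_t txdBuffer[TXD_CAPACITY];
uint8_t txdBufferWriteIndex = 0;   /* the firmware keeps a pointer instead */
uint8_t txdBufferSize = 0;
bool txdBufferEmpty = true;

/* One pass of the wait loop: returns false if the buffer is currently full. */
bool sendTry(uint8_t c) {
  EA = false;                               /* CLR EA: keep the ISR out */
  if (txdBufferSize == TXD_CAPACITY) {
    EA = true;                              /* SETB EA: let the ISR drain the buffer */
    return false;
  }
  txdBuffer[txdBufferWriteIndex] = c;       /* MOV @R0,A */
  if (++txdBufferWriteIndex == TXD_CAPACITY) {
    txdBufferWriteIndex = 0;                /* wrap around to the buffer start */
  }
  ++txdBufferSize;
  if (txdBufferEmpty) {                     /* transmitter idle: kick it off */
    txdBufferEmpty = false;
    TI = true;                              /* SETB TI triggers the serial interrupt */
  }
  EA = true;
  return true;
}

/* The blocking behaviour of the firmware: spin until a slot is free. */
void send(uint8_t c) {
  while (!sendTry(c)) {
    /* on real hardware the transmit ISR frees a slot in the meantime */
  }
}
```

    On the host, send() only terminates while there is free space, since nothing drains the buffer; on the device the serial ISR does that between the SETB EA and the next CLR EA.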

    Example of poor code (remember, the 8051 uses a stack growing towards high addresses):

            PUSH    A           ; save c on the stack
            MOV     A,SP
            ADD     A,#0x0
            MOV     R0,A
            MOV     A,@R0       ; load c from the stack top
            LCALL   fputc       ; call fputc()
            MOV     A,SP
            ADD     A,#0x0
            MOV     R0,A
            MOV     A,@R0       ; load c from the stack top
            LCALL   updateCrc   ; call updateCrc()
            DEC     SP          ; rewind stack

    As a last "goodie" I found this erroneous code:

            CJNE    A,#0xa,LAB_CODE_00da    ; 0xA = \n
            LCALL   send                    ; should send CR, but A still holds LF
    LAB_CODE_00da:
            LCALL   send

    The function should apparently transmit the single input character '\n' as the common sequence CR-LF ('\r', '\n'). But instead of CR it repeats LF. Ouch.
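
    What the routine was presumably meant to do can be sketched in a few lines of C. This is a guess at the intent, not recovered code: captureSend is a hypothetical stand-in for the firmware's send() that records bytes in an array, and putCharFixed is an invented name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capture buffer standing in for the firmware's send(). */
uint8_t sent[16];
size_t sentCount = 0;

void captureSend(uint8_t c) {
  if (sentCount < sizeof sent) {
    sent[sentCount++] = c;
  }
}

/* Presumed intent: expand a single '\n' to the sequence CR LF. */
void putCharFixed(uint8_t c) {
  if (c == '\n') {
    captureSend('\r');   /* the firmware sends '\n' at this point instead */
  }
  captureSend(c);
}
```

    After putCharFixed('A') and putCharFixed('\n') the capture buffer holds 'A', '\r', '\n', whereas the firmware as found would emit 'A', '\n', '\n'.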

    2. Investigating the Known Telegrams

    Since we have just 8 bits as a checksum, a brute-force approach is easily feasible. For this I used the enhanced CRC function and looked for the initial values necessary to produce the observed CRC. The following is the little program I wrote:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    static uint8_t crc;
    static void updateCrc(char c) {
      for (int b = 0; b < 8; ++b) {
        if (((c ^ crc) & 1) != 0) {
          crc ^= 0xXX;
          crc >>= 1;
          crc |= 0x80;
        } else {
          crc >>= 1;
        }
        c >>= 1;