assembly x86 character bitwise-operators

Assembly: Character type checking


I'm practicing disassembly on a C program that checks whether a user-supplied serial number fulfills certain requirements.

The program checks each character of the serial number individually, taking the jump if the character has the desired type. The first check is included below. There are more, but the only things that change between them are the character being examined and the AND key.

var_8= dword ptr -8
var_1= byte ptr -1
user_serialNumber= dword ptr  8

push    ebp
mov     ebp, esp
sub     esp, 18h        ; Integer Subtraction
mov     eax, [ebp+user_serialNumber]
mov     ecx, [ebp+user_serialNumber]
mov     edx, esp
mov     [edx], ecx
mov     [ebp+var_8], eax
call    _strlen         ; Call Procedure
cmp     eax, 19         ; Compare Two Operands
jz      loc_804920A     ; Jump if Zero (ZF=1)

loc_804920A:
call    ___ctype_b_loc  ; Call Procedure
mov     eax, [eax]
mov     ecx, [ebp+user_serialNumber]
movsx   ecx, byte ptr [ecx] ; Move with Sign-Extend
movzx   eax, word ptr [eax+ecx*2] ; Move with Zero-Extend
and     eax, 2048       ; Logical AND
cmp     eax, 0          ; Compare Two Operands
jnz     loc_8049232     ; Jump if Not Zero (ZF=0)
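
For reference, the first check appears to correspond roughly to the C below. This is only a sketch of how I read the disassembly: the name first_check is made up, the return value on a length mismatch is an assumption (that branch is not shown above), and __ctype_b_loc is glibc-specific.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical reconstruction of the first check; glibc only. */
int first_check(const char *serial)
{
    /* call _strlen / cmp eax, 19 (assumed to fail on any other length) */
    if (strlen(serial) != 19)
        return 0;

    /* call ___ctype_b_loc / mov eax, [eax]: current locale's class table */
    const unsigned short *table = *__ctype_b_loc();

    /* movsx ecx, byte ptr [ecx]: the character is sign-extended */
    int c = (signed char)serial[0];

    /* movzx eax, word ptr [eax+ecx*2] / and eax, 2048 / cmp eax, 0 */
    return (table[c] & 2048) != 0;
}
```

The later checks would look the same with a different index into serial and either 2048 or 1024 as the mask.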

The entire function is far too large to post and essentially repeats the same pattern, so I'll list the AND key values for the rest of the checks, together with the characters of the serial number I tried: D</sI0D!/)$_Jw3kaa\

Index  AND key  Serial char
0      2048     D
1      1024     <
2      1024     /
3      2048     s
4      2048     I
5      2048     0
6      2048     D
7      1024     !
8      1024     /
9      1024     )
10     1024     $
11     1024     _
12     2048     J
13     2048     w
14     2048     3
15     2048     k
16     2048     a
17     2048     a
18     1024     \

This is the table I've been using to work out which bit corresponds to which character type.

enum
{
  _ISupper = _ISbit (0),        /* UPPERCASE.  */
  _ISlower = _ISbit (1),        /* lowercase.  */
  _ISalpha = _ISbit (2),        /* Alphabetic.  */
  _ISdigit = _ISbit (3),        /* Numeric.  */
  _ISxdigit = _ISbit (4),       /* Hexadecimal numeric.  */
  _ISspace = _ISbit (5),        /* Whitespace.  */
  _ISprint = _ISbit (6),        /* Printing.  */
  _ISgraph = _ISbit (7),        /* Graphical.  */
  _ISblank = _ISbit (8),        /* Blank (usually SPC and TAB).  */
  _IScntrl = _ISbit (9),        /* Control character.  */
  _ISpunct = _ISbit (10),       /* Punctuation.  */
  _ISalnum = _ISbit (11)        /* Alphanumeric.  */
};

So as far as I understand, 2048 corresponds to _ISalnum (the 12th bit) and 1024 corresponds to _ISpunct (the 11th bit). The program tests for these bits with AND, which clears every other bit and leaves only the bit for _ISpunct or _ISalnum, depending on the key. If that bit is set, the result is nonzero, so cmp eax, 0 leaves ZF clear and the jnz loc_8049232 jump is taken, which is what we want in order to progress to the next character.

This code is from a separate function, which returns 1 to main when every check succeeds.

My problem is that this serial number, like the others I've come up with, is not accepted by the program, and I am having a hard time understanding why, since each character seems to match the desired character type.

EDIT:

The code after the last successful test:

loc_8049499:
mov     [ebp+var_1], 1


loc_804949D:
mov     al, [ebp+var_1]
and     al, 1           ; Logical AND
movzx   eax, al         ; Move with Zero-Extend
add     esp, 18h        ; Add
pop     ebp
retn                    ; Return Near from Procedure
check_serial endp

When a check fails, the program jumps to loc_804949D with [ebp+var_1] set to 0.


Solution

  • I suspect you are using the GNU libc (glibc) headers. Be extra careful, because the definition of _ISbit depends on endianness:

    # if __BYTE_ORDER == __BIG_ENDIAN
    #  define _ISbit(bit)   (1 << (bit))
    # else /* __BYTE_ORDER == __LITTLE_ENDIAN */
    #  define _ISbit(bit)   ((bit) < 8 ? ((1 << (bit)) << 8) : ((1 << (bit)) >> 8))
    # endif
    

    Since x86 is little-endian, the value 2048 (0x800) is actually _ISdigit and 1024 (0x400) is _ISalpha. The first character therefore has to be a digit, which D is not. Its table entry is indeed 0xd508, which means _ISupper | _ISalpha | _ISxdigit | _ISprint | _ISgraph | _ISalnum, so the 0x800 bit is clear and the first check fails.
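
To double-check the arithmetic, here is a minimal sketch that evaluates the little-endian branch of the macro quoted above and then re-tests a candidate serial against the masks from the table in the question. The names LE_ISBIT, bit_for_mask, position_masks and serial_matches are made up for illustration. The check uses the portable isdigit/isalpha instead of the glibc table and assumes the "C" locale and that the function performs no checks beyond the 19 listed.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Little-endian branch of glibc's _ISbit, copied from the macro above. */
#define LE_ISBIT(bit) ((bit) < 8 ? ((1 << (bit)) << 8) : ((1 << (bit)) >> 8))

/* Return the enum bit number (0 = _ISupper ... 11 = _ISalnum) whose
   little-endian mask equals `mask`, or -1 if no class matches. */
int bit_for_mask(int mask)
{
    for (int bit = 0; bit < 12; bit++)
        if (LE_ISBIT(bit) == mask)
            return bit;
    return -1;
}

/* AND keys per character position, taken from the table in the question. */
static const int position_masks[19] = {
    2048, 1024, 1024, 2048, 2048, 2048, 2048, 1024, 1024, 1024,
    1024, 1024, 2048, 2048, 2048, 2048, 2048, 2048, 1024
};

/* Hypothetical re-implementation of the checks: 2048 is treated as
   _ISdigit and 1024 as _ISalpha. Returns 1 if every position passes. */
int serial_matches(const char *serial)
{
    if (strlen(serial) != 19)
        return 0;
    for (size_t i = 0; i < 19; i++) {
        unsigned char c = (unsigned char)serial[i];
        int ok = (position_masks[i] == 2048) ? isdigit(c) : isalpha(c);
        if (!ok)
            return 0;
    }
    return 1;
}
```

Here bit_for_mask(2048) gives 3 (_ISdigit) and bit_for_mask(1024) gives 2 (_ISalpha). Under the assumptions above, a string such as 1ab2345cdefg678901h passes serial_matches, while the serial from the question fails on its first character.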