As a beginner to Assembly, I've been practicing disassembling and reverse engineering on Intel x86 assembly in IDA.
The current program I'm trying to figure out validates the user-given password by forming its own "validation password" and comparing the two. If they match, the user-given password is accepted.
The validation password is formed by a loop that runs 15 times (the counter goes from 0 up to, but not including, 0Fh), and the characters for the password come from a label named CHARACTERS, which stores the address of the string aAbcdefghijklmn.
aAbcdefghijklmn is defined as 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890'.
var_40 is defined as -40h (hexadecimal).
mov [ebp+loop_counter], 0
loc_8049201:
cmp [ebp+loop_counter], 0Fh ; Compare Two Operands
jge loc_804922F ; Jump if Greater or Equal (SF=OF)
mov eax, CHARACTERS
mov ecx, [ebp+loop_counter]
mov ecx, [ebp+ecx*4+var_40]
mov dl, [eax+ecx]
mov eax, [ebp+loop_counter]
mov [ebp+eax+validation_password], dl
mov eax, [ebp+loop_counter]
add eax, 1 ; Add
mov [ebp+loop_counter], eax
jmp loc_8049201 ; Jump
loc_804922F:
lea eax, [ebp+validation_password] ; Load Effective Address
mov ecx, [ebp+user_password]
mov edx, esp
mov [edx+4], ecx
mov [edx], eax
call _strcmp ; Call Procedure
cmp eax, 0 ; Compare Two Operands
jnz loc_804925D ; Jump if Not Zero (ZF=0)
This portion creates the validation password.
mov eax, CHARACTERS
mov ecx, [ebp+loop_counter]
mov ecx, [ebp+ecx*4+var_40]
mov dl, [eax+ecx]
mov eax, [ebp+loop_counter]
mov [ebp+eax+validation_password], dl
mov eax, [ebp+loop_counter]
add eax, 1 ; Add
mov [ebp+loop_counter], eax
jmp loc_8049201 ; Jump
What I cannot for the life of me figure out is how the index into the 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890' string is calculated on each loop iteration.
What I understand is that the character is read from the string address held in CHARACTERS (loaded into eax), with the value at [ebp+ecx*4+var_40] used as the offset, which together select one specific character. That character is then stored in dl.
I do not know how to determine the index number from the memory address calculation [ebp+ecx*4+var_40] in each loop iteration.
EDIT:
Initialization of the array at ebp+var_40 is done with _memcpy earlier in the same function.
push ebp
mov ebp, esp
push esi
sub esp, 64h
mov eax, [ebp+user_password]
xor ecx, ecx
lea edx, unk_804A064 ; db 3
lea esi, [ebp+var_40]
mov [esp], esi
mov [esp+4], edx
mov dword ptr [esp+8], 3Ch
mov [ebp+var_58], eax
mov [ebp+var_5C], ecx
call _memcpy
unk_804A064 is defined as db 3. The 60 bytes starting at unk_804A064 are:
3, 0, 0, 0, 34h, 0, 0, 0, 38h, 0, 0, 0, 1Ah, 0, 0, 0, 2Ch, 0, 0, 0, 2Ch, 0, 0, 0, 1Eh, 0, 0, 0, 26h, 0, 0, 0, 1Bh, 0, 0, 0, 25h, 0, 0, 0, 32h, 0, 0, 0, 13h, 0, 0, 0, 37h, 0, 0, 0, 2Ch, 0, 0, 0, 0Ah, 0, 0, 0
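Since _memcpy copies 3Ch (60) bytes and the loop reads the destination with a *4 scale, one plausible reading of this dump is 15 little-endian 32-bit integers. Here is a minimal Python sketch of that decoding. The names raw and indices are mine, not from the binary, and the bytes are simply transcribed from the dump above.

```python
import struct

# The 60 bytes at unk_804A064, transcribed from the dump above.
raw = bytes([
    0x03, 0, 0, 0,  0x34, 0, 0, 0,  0x38, 0, 0, 0,  0x1A, 0, 0, 0,
    0x2C, 0, 0, 0,  0x2C, 0, 0, 0,  0x1E, 0, 0, 0,  0x26, 0, 0, 0,
    0x1B, 0, 0, 0,  0x25, 0, 0, 0,  0x32, 0, 0, 0,  0x13, 0, 0, 0,
    0x37, 0, 0, 0,  0x2C, 0, 0, 0,  0x0A, 0, 0, 0,
])

# x86 is little-endian, so "<15i" reads 15 signed 32-bit little-endian ints.
indices = list(struct.unpack("<15i", raw))
print(indices)
# [3, 52, 56, 26, 44, 44, 30, 38, 27, 37, 50, 19, 55, 44, 10]
```

Every value is below 62, the length of the alphabet string, which fits the idea that these are indices into it.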
It's a byte gather operation, using int indices from another array on the stack at ebp+var_40. You haven't shown how that array is initialized.
mov ecx, [ebp+loop_counter] loads ECX with the loop counter. (An optimized build would just keep that in a register the whole time. This debug build produces a lot of extra instructions to wade through, which makes it harder to reverse-engineer, but it is also simpler in one way: each block of asm corresponds to a C statement, and there are named local variables on the stack.)
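mov ecx, [ebp+ecx*4+var_40] loads ECX with an int from a stack array, overwriting the loop counter that was in ECX. The *4 scale is sizeof(int), and var_40 (-40h) is the offset of the array's first element from EBP, so the address is the array base plus 4 times the loop counter. It is like int tmp = indices[i]; where int indices[n]; is in automatic storage (on the stack).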
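To make the address arithmetic concrete, here is a small Python sketch that models [ebp+ecx*4+var_40] with var_40 = -40h, as given in the question. The EBP value and the stack dictionary are made up for illustration; only the offset and the *4 scale come from the disassembly.

```python
import struct

VAR_40 = -0x40        # IDA's offset of the array from EBP (from the question)
EBP = 0xFFFFD000      # hypothetical frame pointer, chosen only for illustration

def element_address(ebp, i):
    """Effective address of [ebp + i*4 + var_40], wrapped to 32 bits."""
    return (ebp + i * 4 + VAR_40) & 0xFFFFFFFF

# Toy stack memory: address -> byte. Store the first three ints from the
# question's dump the way memcpy would have laid them out (little-endian).
stack = {}
for i, value in enumerate([0x03, 0x34, 0x38]):
    base = element_address(EBP, i)
    for offset, byte in enumerate(struct.pack("<i", value)):
        stack[base + offset] = byte

def load_dword(addr):
    """Model a 4-byte load such as mov ecx, [addr]."""
    return struct.unpack("<i", bytes(stack[addr + k] for k in range(4)))[0]

for i in range(3):
    addr = element_address(EBP, i)
    print(hex(addr), load_dword(addr))
# 0xffffcfc0 3
# 0xffffcfc4 52
# 0xffffcfc8 56
```

Consecutive iterations read addresses 4 bytes apart, which is exactly what indexing an int array looks like.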
mov dl, [eax+ecx] uses that as an index into your alphabet string (EAX was earlier loaded from CHARACTERS). So, treating CHARACTERS as a char * global in C terms, this is char c = CHARACTERS[tmp];
The next two instructions, mov eax, [ebp+loop_counter] / mov [ebp+eax+validation_password], dl, are validation_password[i] = c;
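Putting those statements together, the whole loop can be sketched in Python. This assumes the byte dump in the question's edit is accurate and decodes to the 15 ints listed here. The names indices, tmp and c follow the C-style names used above and are not from the binary.

```python
# Alphabet string that CHARACTERS points to (from the question).
CHARACTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"

# The int array at ebp+var_40, assuming the 60-byte dump is 15 little-endian ints.
indices = [0x03, 0x34, 0x38, 0x1A, 0x2C, 0x2C, 0x1E, 0x26,
           0x1B, 0x25, 0x32, 0x13, 0x37, 0x2C, 0x0A]

validation_password = []
i = 0                                # mov [ebp+loop_counter], 0
while not (i >= 0x0F):               # cmp [ebp+loop_counter], 0Fh / jge exit
    tmp = indices[i]                 # mov ecx, [ebp+ecx*4+var_40]
    c = CHARACTERS[tmp]              # mov dl, [eax+ecx]
    validation_password.append(c)    # mov [ebp+eax+validation_password], dl
    i += 1                           # add eax, 1 / store back to loop_counter

result = "".join(validation_password)
print(result)
# D15assemblyT4sK
```

Note that the loop body runs 15 times (i = 0 through 14), matching the 15 ints that _memcpy copied.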