I'm taking a reverse engineering course at university and I have a homework assignment. I was given a .obj file (compiled with Visual Studio 2008), and I have to disassemble it, work out the control structure, and call it from a small C program.
I used IDA to disassemble it; here is the asm code:
_FB3:
00000000: 55 push ebp
00000001: 56 push esi
00000002: 57 push edi
00000003: 8B 7C 24 10 mov edi,dword ptr [esp+10h]
00000007: 83 3F 00 cmp dword ptr [edi],0
0000000A: 74 79 je 00000085
0000000C: 8D 64 24 00 lea esp,[esp]
00000010: 8B 2F mov ebp,dword ptr [edi]
00000012: 8B 75 00 mov esi,dword ptr [ebp]
00000015: 8B 44 24 14 mov eax,dword ptr [esp+14h]
00000019: 8B CE mov ecx,esi
0000001B: EB 03 jmp 00000020
0000001D: 8D 49 00 lea ecx,[ecx]
00000020: 8A 10 mov dl,byte ptr [eax]
00000022: 3A 11 cmp dl,byte ptr [ecx]
00000024: 75 1A jne 00000040
00000026: 84 D2 test dl,dl
00000028: 74 12 je 0000003C
0000002A: 8A 50 01 mov dl,byte ptr [eax+1]
0000002D: 3A 51 01 cmp dl,byte ptr [ecx+1]
00000030: 75 0E jne 00000040
00000032: 83 C0 02 add eax,2
00000035: 83 C1 02 add ecx,2
00000038: 84 D2 test dl,dl
0000003A: 75 E4 jne 00000020
0000003C: 33 C0 xor eax,eax
0000003E: EB 05 jmp 00000045
00000040: 1B C0 sbb eax,eax
00000042: 83 D8 FF sbb eax,0FFFFFFFFh
00000045: 85 C0 test eax,eax
00000047: 7D 05 jge 0000004E
00000049: 8D 7D 0C lea edi,[ebp+0Ch]
0000004C: EB 32 jmp 00000080
0000004E: 8B 44 24 14 mov eax,dword ptr [esp+14h]
00000052: 8B CE mov ecx,esi
00000054: 8A 10 mov dl,byte ptr [eax]
00000056: 3A 11 cmp dl,byte ptr [ecx]
00000058: 75 1A jne 00000074
0000005A: 84 D2 test dl,dl
0000005C: 74 12 je 00000070
0000005E: 8A 50 01 mov dl,byte ptr [eax+1]
00000061: 3A 51 01 cmp dl,byte ptr [ecx+1]
00000064: 75 0E jne 00000074
00000066: 83 C0 02 add eax,2
00000069: 83 C1 02 add ecx,2
0000006C: 84 D2 test dl,dl
0000006E: 75 E4 jne 00000054
00000070: 33 C0 xor eax,eax
00000072: EB 05 jmp 00000079
00000074: 1B C0 sbb eax,eax
00000076: 83 D8 FF sbb eax,0FFFFFFFFh
00000079: 85 C0 test eax,eax
0000007B: 7E 1E jle 0000009B
0000007D: 8D 7D 08 lea edi,[ebp+8]
00000080: 83 3F 00 cmp dword ptr [edi],0
00000083: 75 8B jne 00000010
00000085: 6A 10 push 10h
00000087: E8 00 00 00 00 call _malloc
0000008C: 83 C4 04 add esp,4
0000008F: 89 07 mov dword ptr [edi],eax
00000091: 85 C0 test eax,eax
00000093: 75 14 jne 000000A9
00000095: 5F pop edi
00000096: 5E pop esi
00000097: 33 C0 xor eax,eax
00000099: 5D pop ebp
0000009A: C3 ret
0000009B: 8B C5 mov eax,ebp
0000009D: FF 40 04 inc dword ptr [eax+4]
000000A0: 5F pop edi
000000A1: 5E pop esi
000000A2: B8 01 00 00 00 mov eax,1
000000A7: 5D pop ebp
000000A8: C3 ret
000000A9: 8B 74 24 14 mov esi,dword ptr [esp+14h]
000000AD: 8B C6 mov eax,esi
000000AF: 8D 50 01 lea edx,[eax+1]
000000B2: 8A 08 mov cl,byte ptr [eax]
000000B4: 40 inc eax
000000B5: 84 C9 test cl,cl
000000B7: 75 F9 jne 000000B2
000000B9: 2B C2 sub eax,edx
000000BB: 40 inc eax
000000BC: 50 push eax
000000BD: E8 00 00 00 00 call _malloc
000000C2: 8B 0F mov ecx,dword ptr [edi]
000000C4: 89 01 mov dword ptr [ecx],eax
000000C6: 8B 07 mov eax,dword ptr [edi]
000000C8: 83 C4 04 add esp,4
000000CB: 83 38 00 cmp dword ptr [eax],0
000000CE: 74 C5 je 00000095
000000D0: 8B 10 mov edx,dword ptr [eax]
000000D2: 8B CE mov ecx,esi
000000D4: 8A 01 mov al,byte ptr [ecx]
000000D6: 88 02 mov byte ptr [edx],al
000000D8: 41 inc ecx
000000D9: 42 inc edx
000000DA: 84 C0 test al,al
000000DC: 75 F6 jne 000000D4
000000DE: 8B 17 mov edx,dword ptr [edi]
000000E0: C7 42 04 01 00 00 mov dword ptr [edx+4],1
00
000000E7: 8B 07 mov eax,dword ptr [edi]
000000E9: C7 40 08 00 00 00 mov dword ptr [eax+8],0
00
000000F0: 8B 0F mov ecx,dword ptr [edi]
000000F2: 5F pop edi
000000F3: 5E pop esi
000000F4: C7 41 0C 00 00 00 mov dword ptr [ecx+0Ch],0
00
000000FB: B8 01 00 00 00 mov eax,1
00000100: 5D pop ebp
00000101: C3 ret
IDA also generated a nice control-flow graph for me. From it, the code looks something like this:
for(...)
{
for1(...){...}
...
for1(...){...}
}
malloc
....
for3() ...
malloc
...
for2(...)
{
...
}
As far as I can tell, for1 and for2 have nearly the same structure and differ only in what they do, and the function implemented by for3 belongs to the same family as for1 and for2. for3 uses the result of the second malloc as a parameter, so I think for2 is some kind of array copy loop. for1, for2 and for3 are known inlined C standard library implementations.
Can someone help me figure out the purpose of this f3 function?
My second question: how can I use this .obj file in a small sample C program? How can I call its function from Visual Studio?
Thanks in advance, any help is appreciated.
UPDATE: Jester: interesting. How did you know about the node's structure? I'm still trying to figure this whole thing out (with your help), but no luck yet.
I found out that IDA has a pseudocode view. Here is the pseudocode:
signed int __cdecl FB3(int a1, const char *a2)
{
  int v2; // edi@1
  const char **v3; // ebp@2
  void *v4; // eax@7
  signed int result; // eax@8
  int v6; // edx@11
  const char *v7; // ecx@11
  const char v8; // al@12

  v2 = a1;
  while ( *(_DWORD *)v2 )
  {
    v3 = *(const char ***)v2;
    if ( strcmp(a2, **(const char ***)v2) >= 0 )
    {
      if ( strcmp(a2, **(const char ***)v2) <= 0 )
      {
        ++v3[1];
        return 1;
      }
      v2 = (int)(v3 + 2);
    }
    else
    {
      v2 = (int)(v3 + 3);
    }
  }
  v4 = malloc(0x10u);
  *(_DWORD *)v2 = v4;
  if ( v4 && (**(_DWORD **)v2 = malloc(strlen(a2) + 1)) != 0 )
  {
    v6 = **(_DWORD **)v2;
    v7 = a2;
    do
    {
      v8 = *v7;
      *(_BYTE *)v6++ = *v7++;
    }
    while ( v8 );
    *(_DWORD *)(*(_DWORD *)v2 + 4) = 1;
    *(_DWORD *)(*(_DWORD *)v2 + 8) = 0;
    *(_DWORD *)(*(_DWORD *)v2 + 12) = 0;
    result = 1;
  }
  else
  {
    result = 0;
  }
  return result;
}
From this, maybe it counts the occurrences of a number in a string? The pseudocode is still a little unclear to me.
I tried to call this function from a sample program, without success. I declared it as extern signed int fb3(int a1, const char *a2); and tried to call it, but the linker reports "unresolved external symbol _fb3 referenced in function _main" (so I guess the .obj file contains no fb3 function matching the signature I declared with extern, i.e. the signature is wrong).
Here is the sample program (main.c) I tried:
#include <stdio.h>

extern signed int fb3(int a1, const char *a2);

int main(void)
{
    char b[3] = {'e','3','y'};
    signed int i = fb3(3,b);
    printf("%d",i);
    return 0;
}
I've also added f3.obj to the linker input (VS2010).
UPDATE2: I implemented the node struct and used the correctly capitalized function name; now it compiles successfully.
The sample program:
#include <stdio.h>

typedef struct node
{
    int count;
    const char * text;
    struct node* right;
    struct node* left;
} node;

extern int FB3(node* root, const char *text);

int main(void)
{
    node* root;
    signed int i;
    int j;
    root = (node*)malloc(sizeof(node));
    root->count = 0;
    root->text = "textone";
    root->right = NULL;
    root->left = NULL;
    printf("value = %d\n", FB3(root,"v"));
    printf("value = %d\n", FB3(root,"b"));
    printf("value = %d\n", FB3(root,"c"));
    printf("value = %d\n", FB3(root,"3dasf"));
    printf("value = %d\n", FB3(root,"3ssdfs"));
    printf("value = %d\n", FB3(root,"dsda"));
    printf("value = %d\n", FB3(root,"v"));
    printf("value = %d\n", FB3(root,"gsda"));
    printf("value = %d\n", FB3(root,"gsda"));
    printf("value = %d\n", FB3(root,"a"));
    printf("value = %d\n", FB3(root,"ab"));
    return 0;
}
The output is:
value = 1
value = 1
value = 1
... (only value = 1)
The interesting thing is that the 7th printf should print "value = 2", because "v" is already in the tree, shouldn't it?
From a quick glance this seems to be a binary tree used to count string occurrences. Tree node looks like:
const char* text;
int count;
node* left;
node* right;
The function itself is int addstring(node** root, const char* text)
First the code checks if the tree is empty and skips the search if it is.
The search starts at 0x10 by doing if (strcmp(current->text, text) > 0) current = current->right; and looping back. The code doesn't look optimized: at 0x4E it does the same comparison again, this time checking for < 0, and goes left. At 0x9B is the "found" branch, which increments the counter and returns 1.
If the text is not found, a new node is created at 0x85 and inserted into the tree, and the text is copied into it using strdup (implemented as malloc(strlen() + 1) + strcpy). Both left and right of the new node are set to NULL and the count to 1.
Update: The node size is 16 bytes, as can be seen from the malloc invocation. Offset 0 is used to compare the text, so that must be the text. Offset 4 is incremented, so that must be the counter. Offsets 8 and 12 are the two child pointers because they are used as such.
The prototype IDA has come up with is nonsense: the first argument must be a pointer, otherwise it will blow up. Also, C is case sensitive, so try FB3 (in capitals).
Something like this:
#include <stdio.h>

extern int FB3(void** root, const char *text);

int main(void)
{
    void* root = NULL;
    int i = FB3(&root, "e3y");
    printf("%p %d", root, i);
    return 0;
}
If that works, you can go ahead and add the node struct so you can then traverse and print the tree from C.