I am trying to solve a slightly modified Bomb Lab problem for my Computer Architecture class. I'm supposed to write the C equivalent for the functions, but got stuck in Phase 5. It's very similar to this question and I have indeed figured out what the function does for the most part.
105b: 56 push %esi
105c: 53 push %ebx
105d: 83 ec 10 sub $0x10,%esp
1060: e8 6b fa ff ff call ad0 <__x86.get_pc_thunk.bx>
1065: 81 c3 fb 3e 00 00 add $0x3efb,%ebx
106b: 8b 74 24 1c mov 0x1c(%esp),%esi
106f: 56 push %esi
1070: e8 bf 02 00 00 call 1334 <string_length>
1075: 83 c4 10 add $0x10,%esp
1078: 83 f8 06 cmp $0x6,%eax
107b: 75 2e jne 10ab <phase_5+0x50>
107d: 89 f0 mov %esi,%eax
107f: 83 c6 06 add $0x6,%esi
1082: b9 00 00 00 00 mov $0x0,%ecx
1087: 0f b6 10 movzbl (%eax),%edx
108a: 83 e2 0f and $0xf,%edx
108d: 03 8c 93 00 da ff ff add -0x2600(%ebx,%edx,4),%ecx
1094: 83 c0 01 add $0x1,%eax
1097: 39 f0 cmp %esi,%eax
1099: 75 ec jne 1087 <phase_5+0x2c>
109b: 83 f9 34 cmp $0x34,%ecx
109e: 74 05 je 10a5 <phase_5+0x4a>
10a0: e8 38 05 00 00 call 15dd <explode_bomb>
10a5: 83 c4 04 add $0x4,%esp
10a8: 5b pop %ebx
10a9: 5e pop %esi
10aa: c3 ret
10ab: e8 2d 05 00 00 call 15dd <explode_bomb>
10b0: eb cb jmp 107d <phase_5+0x22>
It's a function that accepts a string of 6 characters (the bomb explodes if the length is anything else) and runs a loop over the characters that produces a number. At the end, if the result of the loop does not equal 52 (0x34), the bomb explodes. However, I couldn't understand a certain part of the code:
108d: 03 8c 93 00 da ff ff add -0x2600(%ebx,%edx,4),%ecx
Apparently, it offsets the number you've obtained by masking the ASCII equivalent of each character in the string by some unknown algorithm. For now, I've made a table of offsets for each character and managed to get an accepted string, aaaabb, but I would like to know what the C equivalent of the code looks like.
Just like in Jester's answer, it's indexing an array. ecx += table[edx], for static int table[];
The EDX index is scaled by 4 in the addressing mode because sizeof(int) is 4; asm needs byte offsets, C indexing uses element offsets.
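A rough C sketch of the whole function, following the disassembly line by line. The table contents here are placeholders, not values read from this binary: they were chosen only so that the asker's accepted string aaaabb sums to 52, and the real 16 values have to be dumped from the executable. The names table, phase_5_sum and phase_5_accepts are made up for illustration, and the real function calls explode_bomb instead of returning 0.

```c
#include <string.h>

/* HYPOTHETICAL contents: placeholders chosen so that "aaaabb" sums to 52.
 * The real values are the 16 ints stored at EBX - 0x2600 in the binary. */
static const int table[16] = {
    2, 10, 6, 1, 12, 16, 9, 3, 4, 7, 14, 5, 11, 8, 15, 13
};

/* The loop at 1087..1099: sum table[c & 0xf] over the 6 characters. */
int phase_5_sum(const char *input)
{
    int sum = 0;                                   /* mov $0x0,%ecx */
    for (const char *p = input; p != input + 6; p++) {
        unsigned idx = (unsigned char)*p & 0xf;    /* movzbl (%eax),%edx ; and $0xf,%edx */
        sum += table[idx];                         /* add -0x2600(%ebx,%edx,4),%ecx */
    }
    return sum;
}

/* Returns 1 where the real phase_5 returns normally,
 * 0 where it would call explode_bomb. */
int phase_5_accepts(const char *input)
{
    if (strlen(input) != 6)                        /* cmp $0x6,%eax ; jne */
        return 0;
    return phase_5_sum(input) == 0x34;             /* cmp $0x34,%ecx */
}
```

Only the low 4 bits of each character matter, so any two characters that agree in those bits (for example 'a' = 0x61 and 'q' = 0x71) select the same table entry.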
-0x2600 + %ebx is a static array, same as 0x804a4a0 in the linked question. But it's harder to find in static disassembly because whoever created this executable annoyingly compiled it as 32-bit PIE (position-independent executable).
32-bit PIC / PIE sucks because PC-relative addressing was new with x86-64, so this is needlessly more complicated to reverse engineer.
It gets the GOT (Global Offset Table) address into EBX: First call __x86.get_pc_thunk.bx returns its return address in EBX, i.e. 0x1065 with the placeholder addresses you got from objdump -d. Then add $0x3efb,%ebx adds the offset from that location to the GOT.
Then static data is addressed relative to the GOT base (in EBX in this case). It's a pain in the ass to follow that vs. an absolute address. You could just single-step in a debugger in a running process, after the kernel's program-loader maps the code to some virtual address (other than 0x1000).
Or do it manually: 0x1065 + 0x3efb = GOT base (EBX) of 0x4f60.
0x4f60-0x2600 is the lookup table array start: 0x2960 (if you were using objdump -D to dump the data sections as well). You might be able to use that address in GDB (before start or run) with an x command to dump the table in a convenient format other than fake disassembly of data as code from objdump.
The actual address in a running process will be that plus some multiple of 4096 (0x1000).
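The address arithmetic above, written out as a small C sketch so the steps can be checked or reused for another function. The helper names are made up for illustration, and the load base used in the usage note is an assumption (a value GDB commonly shows for a 32-bit PIE with ASLR disabled), not something taken from this binary.

```c
#include <stdint.h>

/* EBX after "call __x86.get_pc_thunk.bx" followed by "add $imm,%ebx":
 * the thunk returns its own return address, then the immediate is added. */
uint32_t got_base(uint32_t thunk_return_addr, uint32_t add_imm)
{
    return thunk_return_addr + add_imm;
}

/* File-relative address of data referenced as -disp(%ebx,...). */
uint32_t data_addr(uint32_t got, uint32_t disp)
{
    return got - disp;
}

/* Address in a running process: page-aligned load base plus the
 * file-relative address. */
uint32_t runtime_addr(uint32_t load_base, uint32_t file_addr)
{
    return load_base + file_addr;
}
```

With the numbers from the disassembly, got_base(0x1065, 0x3efb) is 0x4f60 and data_addr(0x4f60, 0x2600) is 0x2960. Under the assumed load base of 0x56555000, runtime_addr would give 0x56557960; check the real base in your own debugger session.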