I have some very strange code which, as far as I understand, overwrites the return address of the function b, so that the function f is called when b returns. But I do not quite understand why, after f has run, execution returns to main and b is called again from there. P.S.: The code only works on a 32-bit system.
#include <iostream>

int f() {
    std::cout << "Hello";
    return 2;
}

int b() {
    int *m[1];
    m[3] = (int *)&f;
    return 1;
}

int main() {
    return b();
}
I tried stepping through the disassembly, but it didn't reveal anything special.
Assembly main:
int main() {
004724E0 push ebp
004724E1 mov ebp,esp
004724E3 sub esp,0C0h
004724E9 push ebx
004724EA push esi
004724EB push edi
004724EC mov edi,ebp
004724EE xor ecx,ecx
004724F0 mov eax,0CCCCCCCCh
004724F5 rep stos dword ptr es:[edi]
004724F7 mov ecx,offset _666773A0_main@cpp (047E068h)
004724FC call @__CheckForDebuggerJustMyCode@4 (0471389h)
00472501 nop
return b();
00472502 call b (0471456h)
}
00472507 pop edi
00472508 pop esi
00472509 pop ebx
0047250A add esp,0C0h
00472510 cmp ebp,esp
00472512 call __RTC_CheckEsp (0471294h)
00472517 mov esp,ebp
00472519 pop ebp
0047251A ret
Assembly b:
int b() {
004722A0 push ebp
004722A1 mov ebp,esp
004722A3 sub esp,0CCh
004722A9 push ebx
004722AA push esi
004722AB push edi
004722AC lea edi,[ebp-0Ch]
004722AF mov ecx,3
004722B4 mov eax,0CCCCCCCCh
004722B9 rep stos dword ptr es:[edi]
004722BB mov ecx,offset _666773A0_main@cpp (047E068h)
004722C0 call @__CheckForDebuggerJustMyCode@4 (0471389h)
004722C5 nop
int *m[1];
m[3] = (int *)&f;
004722C6 mov eax,4
004722CB imul ecx,eax,3
004722CE mov dword ptr m[ecx],offset f (0471172h)
return 1;
004722D6 mov eax,1
}
004722DB push edx
004722DC mov ecx,ebp
004722DE push eax
004722DF lea edx,ds:[472300h]
004722E5 call @_RTC_CheckStackVars@8 (047122Bh)
004722EA pop eax
004722EB pop edx
004722EC pop edi
004722ED pop esi
004722EE pop ebx
004722EF add esp,0CCh
004722F5 cmp ebp,esp
004722F7 call __RTC_CheckEsp (0471294h)
004722FC mov esp,ebp
004722FE pop ebp
004722FF ret
00472300 add dword ptr [eax],eax
00472302 add byte ptr [eax],al
00472304 or byte ptr [ebx],ah
00472306 inc edi
00472307 add al,bh
00472309 ?? ??????
0047230A ?? ??????
Assembly f:
int f() {
00472400 push ebp
00472401 mov ebp,esp
00472403 sub esp,0C0h
00472409 push ebx
0047240A push esi
0047240B push edi
0047240C mov edi,ebp
0047240E xor ecx,ecx
00472410 mov eax,0CCCCCCCCh
00472415 rep stos dword ptr es:[edi]
00472417 mov ecx,offset _666773A0_main@cpp (047E068h)
0047241C call @__CheckForDebuggerJustMyCode@4 (0471389h)
00472421 nop
std::cout << "Hello";
00472422 push offset string "Hello" (0479B30h)
00472427 mov eax,dword ptr [__imp_std::cout (047D0C8h)]
0047242C push eax
0047242D call std::operator<<<std::char_traits<char> > (04711A9h)
00472432 add esp,8
return 2;
00472435 mov eax,2
}
0047243A pop edi
0047243B pop esi
0047243C pop ebx
0047243D add esp,0C0h
00472443 cmp ebp,esp
00472445 call __RTC_CheckEsp (0471294h)
0047244A mov esp,ebp
0047244C pop ebp
0047244D ret
I can provide more code if needed.
Update after disassembly was posted: Jester commented with the answer:
Once f tries to return, it will pop off the next item from the stack and jump there. Looking at the assembly code, that will be the result of the push edi at 004724EB in main. No telling what edi contains at that point, but it sounds like you got lucky and it was a valid code address that eventually happened to invoke b again.
Like I said below, normally we'd expect it to just crash since it's not common to have another valid code address on the stack right above a return address. But in this case we do, probably since main's caller has to deal with addresses and probably has them in registers, and MSVC debug-mode's main uses EDI for rep stosd to poison some stack memory so it has to save/restore it.
What compiler with what options, targeting what ISA?
(Updated re: your edit: that's 32-bit x86. From mov eax,0CCCCCCCCh / rep stosd, that's MSVC in a debug build, poisoning stack memory so uninitialized variables have a recognizable bit-pattern. And MSVC pads between functions with int3 instructions, not NOPs like GCC/Clang use, so fall-through between functions is less plausible unless one happened to be a multiple of 16 machine-code bytes.)
This code has undefined behaviour that's visible at compile time, so there's no guarantee the compiler compiled it to asm that actually stores to an array on the stack and returns.
For some kinds of UB, some compilers (notably Clang and sometimes GCC) assume that path of execution is unreachable and stop emitting instructions for it, including omitting the ret at the end of a function so execution falls into whatever comes next in the binary. That's especially likely with optimization enabled, although this UB is visible even without optimization.
If it does compile as-written, you're overwriting something on the stack with a function address. If the thing at m[3] is the function's return address, then you'll return to f instead of the call-site that originally set the return address.
So things are already weird, and I'd normally expect it to crash when f tries to return. IDK why there'd be another valid return address right above the one b popped to get to f.
The code only works on a 32-bit system
32-bit x86 I assume? There are other 32-bit ISAs, including ARM and RISC-V, which are widely available on hobby boards.
On x86-64 at least, the stack pointer will be misaligned on entry to f: RSP % 16 == 0 before a call is required in x86-64 calling conventions, so RSP % 16 == 8 on function entry after the call instruction pushes an 8-byte return address. On many other ISAs, call / bl / jal just puts the return address in a register (the "link register"), so stack alignment is the same on function entry as at a call-site within another function.
And cout<< functions might well do something that relies on 16-byte RSP alignment, like using movaps to copy 16 bytes to and/or from a stack variable. Glibc printf and scanf do.
But of course, x86-64 uses 8-byte pointers and int is only 32-bit, so you're only overwriting part of the return address at best, even with a different offset into m[]. This ROP demo could still work in a Linux non-PIE executable (gcc -fno-pie -no-pie -fno-stack-protector) since static addresses (including code) will be in the low 31 bits of virtual address-space.