I'm reading a book that shows how buffer overflow attacks and one technique to stop it is called Stack Randomization, below is quoted from the book:
a persistent attacker can overcome randomization by brute force, repeatedly attempting attacks with different addresses. A common trick is to include a long sequence of nop (pronounced “no op,” short for “no operation”) instructions before the actual exploit code. Executing this instruction has no effect, other than incrementing the program counter to the next instruction. As long as the attacker can guess an address somewhere within this sequence, the program will run through the sequence and then hit the exploit code. If we set up a 256-byte(28) nop sled, then the randomization over n = 223 can be cracked by enumerating 215 = 32,768 starting addresses
I understand the first part, but don't get the second part about enumerating starting addresses. For example, %rsp
points to the current start address as picture below shows (only show 8 bytes instead of 256 bytes for simplicity)
I think what the author mean is, try and guess different address to the stack memory where the %rsp
is pointing to. And the padding between return address and %rsp
are all nop
, and then overwrite the return address with the guessed address which is highly likely points to part of padding(nop
). But since Stack Randomization allocats a random amount of space between 0 and n bytes on the stack at the start of a program, so we can only say the probability of successful attack is 215/223 = 0.78%, how can we say try 32,768(a fixed number) addresses then it will be cracked? it is the same as toss a coin, you can only say the probablity of getiing a head is 0.5, you cannot say you will get a head in the second round, becuase you might get two tails
Stack Randomization allocats a random amount of space between 0 and n bytes on the stack at the start of a program
No, it doesn't allocate. It randomizes where in virtual address space the stack is mapped, leaving other pages unmapped. This is with page granularity.
Guessing wrong will typically result in the victim program segfaulting (or whatever the OS does on an invalid page fault). This is noisy and obvious to any intrusion-detection. And if the program does eventually restart so you can try again, its stack address will be re-randomized, as you suggest.
Wrong guesses that land in valid memory but not your NOP sled will also typically crash soon, on an invalid instruction or something that decodes to an invalid memory access.
So yes, you're right, you can't just enumerate the address space, it's only a probabilistic thing. Trying enough times will mean it's likely you succeed, not guaranteed. And trying to enumerate the possible ASLR entropy isn't particularly useful, or any more likely to succeed sooner than guessing the same page every time I think.
But trying different offsets within a single page is useful because OS-driven stack ASLR only has page granularity.
There is some amount of stack space used by env vars and command line args which will vary from system to system, but tend to be constant across invocations of the same program, I think. Unless the vulnerable function is reached from different call chains, or there's a variable-sized stack allocation in a parent function, in which case the buffer's offset within page could vary every run.
Although of course most processes these days run with non-executable stacks, making direct code-injection impossible for stack buffers. A ROP attack (injecting a return address to an existing sequence of bytes that decode to something interesting) is possible if there are any known static data or code addresses. If not, you have to deal with ASLR of code/data, but you don't get to inject NOP sleds.