Crashes when run on a Windows version and CPU that support CET (verified on Win11 23H2, i7-1365U).
Works fine on a CPU that doesn't support CET (verified on Win11 23H2, i7-10750H).
Works fine anywhere when only one of the flags is enabled (doesn't matter which).
Verified with debugger that my code uses very little stack space, so it can't be my stack that overflowed.
Happens regardless of whether I build my app in C or C++.
Affects both VS2019 and VS2022.
Create an empty C++ console project.
Set the following project properties for release build:
- C/C++ > Code Generation > Spectre Mitigation > All Loads (/Qspectre-load)
- Linker > Advanced > CET Shadow Stack Compatible > Yes (/CETCOMPAT)
Use the following code:
#include <iostream>

__declspec(noinline) void Increment(size_t& num) {
    num++;
}

int main() {
    std::cout << "Running...\n";
    size_t num = 0;
    while (true) {
        Increment(num);
    }
    std::cout << num;
    return 0;
}
Build release and run on any Windows version and Intel CPU that support CET.
Expected behaviour: program runs continuously without returning.
Actual behaviour: crashes with EXCEPTION_STACK_OVERFLOW
It's a Visual Studio compiler issue. Until Microsoft fixes it, your options are:
Issue has been submitted to Microsoft: https://developercommunity.visualstudio.com/t/Apps-built-with-QSpectre-load-and-CETC/10949177
If we look at the assembly (annotated with source code):
num++;
00007FF6296E1000 inc qword ptr [rcx]
00007FF6296E1003 lfence
}
00007FF6296E1006 pop r11 <-------------------- ret converted to pop and jmp by /Qspectre-load
00007FF6296E1008 lfence
00007FF6296E100B jmp r11
00007FF6296E100E int 3
00007FF6296E100F int 3
int main() {
00007FF6296E1010 sub rsp,28h
std::cout << "Running...\n";
00007FF6296E1014 mov rcx,qword ptr [__imp_std::cout (07FF6296E3080h)]
00007FF6296E101B lfence
00007FF6296E101E call std::operator<<<std::char_traits<char> > (07FF6296E1040h)
size_t num = 0;
00007FF6296E1023 mov qword ptr [rsp+30h],0
00007FF6296E102C nop dword ptr [rax]
while (true) {
Increment(num);
00007FF6296E1030 lea rcx,[num]
00007FF6296E1035 call Increment (07FF6296E1000h) <------------------------ exception address
}
00007FF6296E103A jmp main+20h (07FF6296E1030h)
Exception address is as labelled above.
Value in exception parameter #2 is the address of the top of CET shadow stack.
The shadow stack is flooded with repeated entries of 00007FF6296E103A, which is the return address of Increment().
Every call Increment (07FF6296E1000h) pushes the return address onto the shadow stack, but it is never popped.
Eventually the shadow stack runs out of space and the next call crashes with EXCEPTION_STACK_OVERFLOW when it tries to push the return address onto the shadow stack.
An address on the shadow stack is supposed to be popped by a ret instruction, but /Qspectre-load converts ret into a pop and jmp, which is what happened with Increment() as labelled above.
That is why the shadow stack grew until it overflowed.
This exact problem is described here: https://devblogs.microsoft.com/oldnewthing/20241015-00/?p=110374
And this technique does not play friendly with CET: The shadow stack just grows and grows because no ret instruction is ever executed.
A solution is touched on here: https://techcommunity.microsoft.com/blog/windowsosplatform/developer-guidance-for-hardware-enforced-stack-protection/2163340
techniques that manually return to a previous call frame that is not the preceding call frame will also need to be shadow stack aware. In this case, it is recommended to use the _incsspq intrinsic to pop return addresses off the shadow stack so that it is in sync with the call stack.
The documentation for /Qspectre-load states (https://learn.microsoft.com/en-us/cpp/build/reference/qspectre-load?view=msvc-170):
Control flow instructions that load memory, including RET and CALL, are split into a load and a control flow transfer.
ret was split into pop and jmp here, but somehow call wasn't split into push and jmp.
Had that been done, we wouldn't have had this problem.