My end goal here is to provide a means to catch a floating-point exception, print a stack trace, and resume execution with floating-point exceptions disabled (using the resulting non-finite/not-a-number values). I've made some progress since my original question: I realized that more registers need to be adjusted to clear and configure the floating-point unit when using SSE (the default on x64).
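For reference, here is a minimal, portable sketch of the behavior I am after once recovery works: with floating-point exceptions masked, a division by zero returns infinity and only raises a sticky status flag. It uses the standard <cfenv> interface rather than the MSVC-specific _controlfp_s. The names MaskedResult and divideMasked are hypothetical, and the sketch assumes the default floating-point environment, in which all exceptions are masked.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical result type, introduced only for this illustration.
struct MaskedResult {
    double value;    // quotient produced while exceptions are masked
    bool divByZero;  // true if the sticky FE_DIVBYZERO flag was raised
};

// Hypothetical helper: divide with exceptions masked (the assumed default
// environment) and report whether the divide-by-zero flag was raised.
MaskedResult divideMasked(double numerator, double denominator) {
    // volatile keeps the compiler from folding the division at compile time.
    volatile double n = numerator;
    volatile double d = denominator;

    std::feclearexcept(FE_ALL_EXCEPT);
    // Sanity check: no status flags are pending before the division.
    assert(std::fetestexcept(FE_ALL_EXCEPT) == 0);

    volatile double q = n / d;

    MaskedResult result;
    result.value = q;
    result.divByZero = std::fetestexcept(FE_DIVBYZERO) != 0;
    return result;
}
```

Calling divideMasked(5.0, 0.0) should give an infinite value with divByZero set, which is the state the filter below tries to restore after the first trap.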
I got a very simple example working, but things fall apart on x64 once I switch to a release build. Both debug and release builds work fine against an x86 target. I've narrowed the issue down to Visual Studio's "Run-Time Checks" option, specifically RTCs, which "Enables stack frame run-time error checking".
Here's the example program:
#include "stdafx.h"
#include <float.h>
#include <Windows.h>
#include <xmmintrin.h>
double zero = 0.0;
void printException(EXCEPTION_POINTERS * ExceptionInfo){
    bool bFloatingPointRecoverFlag = false;
    switch(ExceptionInfo->ExceptionRecord->ExceptionCode)
    {
    case EXCEPTION_ACCESS_VIOLATION:
        fputs(" EXCEPTION_ACCESS_VIOLATION\n", stderr);
        break;
    case EXCEPTION_ARRAY_BOUNDS_EXCEEDED:
        fputs(" EXCEPTION_ARRAY_BOUNDS_EXCEEDED\n", stderr);
        break;
    case EXCEPTION_BREAKPOINT:
        fputs(" EXCEPTION_BREAKPOINT\n", stderr);
        break;
    case EXCEPTION_DATATYPE_MISALIGNMENT:
        fputs(" EXCEPTION_DATATYPE_MISALIGNMENT\n", stderr);
        break;
    case EXCEPTION_FLT_DENORMAL_OPERAND:
        fputs(" EXCEPTION_FLT_DENORMAL_OPERAND\n", stderr);
        break;
    case EXCEPTION_FLT_DIVIDE_BY_ZERO:
        fputs(" EXCEPTION_FLT_DIVIDE_BY_ZERO\n", stderr);
        break;
    case EXCEPTION_FLT_INEXACT_RESULT:
        fputs(" EXCEPTION_FLT_INEXACT_RESULT\n", stderr);
        break;
    case EXCEPTION_FLT_INVALID_OPERATION:
        fputs(" EXCEPTION_FLT_INVALID_OPERATION\n", stderr);
        break;
    case EXCEPTION_FLT_OVERFLOW:
        fputs(" EXCEPTION_FLT_OVERFLOW\n", stderr);
        break;
    case EXCEPTION_FLT_STACK_CHECK:
        fputs(" EXCEPTION_FLT_STACK_CHECK\n", stderr);
        break;
    case EXCEPTION_FLT_UNDERFLOW:
        fputs(" EXCEPTION_FLT_UNDERFLOW\n", stderr);
        bFloatingPointRecoverFlag = 1;
        break;
    case EXCEPTION_ILLEGAL_INSTRUCTION:
        fputs(" EXCEPTION_ILLEGAL_INSTRUCTION\n", stderr);
        break;
    case EXCEPTION_IN_PAGE_ERROR:
        fputs(" EXCEPTION_IN_PAGE_ERROR\n", stderr);
        break;
    case EXCEPTION_INT_DIVIDE_BY_ZERO:
        fputs(" EXCEPTION_INT_DIVIDE_BY_ZERO\n", stderr);
        break;
    case EXCEPTION_INT_OVERFLOW:
        fputs(" EXCEPTION_INT_OVERFLOW\n", stderr);
        break;
    case EXCEPTION_INVALID_DISPOSITION:
        fputs(" EXCEPTION_INVALID_DISPOSITION\n", stderr);
        break;
    case EXCEPTION_NONCONTINUABLE_EXCEPTION:
        fputs(" EXCEPTION_NONCONTINUABLE_EXCEPTION\n", stderr);
        break;
    case EXCEPTION_PRIV_INSTRUCTION:
        fputs(" EXCEPTION_PRIV_INSTRUCTION\n", stderr);
        break;
    case EXCEPTION_SINGLE_STEP:
        fputs(" EXCEPTION_SINGLE_STEP\n", stderr);
        break;
    case EXCEPTION_STACK_OVERFLOW:
        fputs(" EXCEPTION_STACK_OVERFLOW\n", stderr);
        break;
    default:
        fputs(" Unrecognized Exception\n", stderr);
        break;
    }
}
LONG WINAPI myfunc(EXCEPTION_POINTERS * ExceptionInfo){
    printf("#########Caught Ya:");
    printException(ExceptionInfo);
    printf("ExceptionAddr = 0x%p\n",ExceptionInfo->ExceptionRecord->ExceptionAddress);
    /* clear the exception */
    unsigned int stat = _clearfp();
    /* disable fp exceptions*/
    unsigned int ctrlwrd;
    errno_t err = _controlfp_s(&ctrlwrd, _MCW_EM, _MCW_EM);
    /* Disable and clear the exceptions in the exception context */
#if _WIN64
    /* Get current context to get the values of MxCsr register, which was
     * set by the calls to _controlfp above, we need to copy these into
     * the exception context so that exceptions really stay disabled.
     * References:
     * https://msdn.microsoft.com/en-us/library/yxty7t75.aspx
     * https://software.intel.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz
     */
    _CONTEXT myContext;
    GetThreadContext(GetCurrentThread(),&myContext);
    ExceptionInfo->ContextRecord->FltSave.ControlWord = ctrlwrd;
    ExceptionInfo->ContextRecord->FltSave.StatusWord = 0;
    ExceptionInfo->ContextRecord->FltSave.MxCsr = myContext.FltSave.MxCsr;
    ExceptionInfo->ContextRecord->FltSave.MxCsr_Mask = myContext.FltSave.MxCsr_Mask;
    ExceptionInfo->ContextRecord->MxCsr = myContext.MxCsr;
#else
    ExceptionInfo->ContextRecord->FloatSave.ControlWord = ctrlwrd;
    ExceptionInfo->ContextRecord->FloatSave.StatusWord = 0;
#endif
    return EXCEPTION_CONTINUE_EXECUTION;
}
int _tmain(int argc, _TCHAR* argv[])
{
    double a;
    double b;
    double c;
    double d;
    double e;
    /* do something so that zero can't get optimized */
    if(argc > 999999){
        zero = (double) argc;
    }
    /* Enable fp exceptions */
    _controlfp_s(0, 0, _MCW_EM);
    /* Setup our unhandled exception filter */
    SetUnhandledExceptionFilter(myfunc);
    b = 5.0+zero;
    /* do something bad */
    a = 5.0 / zero;
    c = a * b;
    e = 5.0 / zero;
    d = 4.0 + e;
    printf("a = %f\n",a);
    printf("b = %f\n",b);
    printf("c = %f\n",c);
    printf("d = %f\n",d);
    printf("e = %f\n",e);
    return 0;
}
With RTCs ENABLED, this code creates output (which is what I expect):
#########Caught Ya: EXCEPTION_FLT_DIVIDE_BY_ZERO
ExceptionAddr = 0x000000013F7A1638
a = 1.#INF00
b = 5.000000
c = 1.#INF00
d = 1.#INF00
e = 1.#INF00
With RTCs DISABLED, this code creates output:
#########Caught Ya: EXCEPTION_FLT_DIVIDE_BY_ZERO
ExceptionAddr = 0x000000013F0415F2
#########Caught Ya: EXCEPTION_ACCESS_VIOLATION
ExceptionAddr = 0x000000007711B519
#########Caught Ya: EXCEPTION_ACCESS_VIOLATION
ExceptionAddr = 0x000000007711B519
#########Caught Ya: EXCEPTION_ACCESS_VIOLATION
ExceptionAddr = 0x000000007711B519
.... repeat forever
So, in summary:
Against an x86 target (WIN32): no problems with either debug or release builds.
Against an x64 target (WIN64): an access violation occurs after attempting to recover from the floating-point exception, but only when RTCs is DISABLED.
Any thoughts on what RTCs does, and why it would affect the behavior of recovering from a floating-point exception?
EDIT: I've debugged this a bit further in assembly. The violation occurs after returning from the filter-function, but before resuming at the divide-by-zero. Below is the assembly leading up to the access violation (the last line of assembly is the culprit):
000000007711B2EF mov dword ptr [rcx+0F0h],edi
000000007711B2F5 fxsave [rcx+100h]
000000007711B2FC movaps xmmword ptr [rcx+1A0h],xmm0
000000007711B303 movaps xmmword ptr [rcx+1B0h],xmm1
000000007711B30A movaps xmmword ptr [rcx+1C0h],xmm2
000000007711B311 movaps xmmword ptr [rcx+1D0h],xmm3
000000007711B318 movaps xmmword ptr [rcx+1E0h],xmm4
000000007711B31F movaps xmmword ptr [rcx+1F0h],xmm5
000000007711B326 movaps xmmword ptr [rcx+200h],xmm6
000000007711B32D movaps xmmword ptr [rcx+210h],xmm7
000000007711B334 movaps xmmword ptr [rcx+220h],xmm8
000000007711B33C movaps xmmword ptr [rcx+230h],xmm9
000000007711B344 movaps xmmword ptr [rcx+240h],xmm10
000000007711B34C movaps xmmword ptr [rcx+250h],xmm11
000000007711B354 movaps xmmword ptr [rcx+260h],xmm12
000000007711B35C movaps xmmword ptr [rcx+270h],xmm13
000000007711B364 movaps xmmword ptr [rcx+280h],xmm14
000000007711B36C movaps xmmword ptr [rcx+290h],xmm15
000000007711B374 stmxcsr dword ptr [rcx+34h]
000000007711B378 mov rax,qword ptr [rsp+8]
000000007711B37D mov qword ptr [rcx+0F8h],rax
000000007711B384 mov eax,dword ptr [rsp]
000000007711B387 mov dword ptr [rcx+44h],eax
000000007711B38A mov dword ptr [rcx+30h],10000Fh
000000007711B391 add rsp,8
000000007711B395 ret
000000007711B396 int 3
000000007711B397 int 3
000000007711B398 int 3
000000007711B399 int 3
000000007711B39A int 3
000000007711B39B int 3
000000007711B39C nop dword ptr [rax]
000000007711B39F push rbp
000000007711B3A0 push rsi
000000007711B3A1 push rdi
000000007711B3A2 sub rsp,30h
000000007711B3A6 mov rbp,rsp
000000007711B3A9 test rdx,rdx
000000007711B3AC je 000000007711B4E4
000000007711B3B2 cmp dword ptr [rdx],80000029h
000000007711B3B8 jne 000000007711B3C4
000000007711B3BA cmp dword ptr [rdx+18h],1
000000007711B3BE jae 000000007711B634
000000007711B3C4 cmp dword ptr [rdx],80000026h
000000007711B3CA jne 000000007711B4E4
000000007711B3D0 mov rax,qword ptr [rdx+20h]
000000007711B3D4 mov r8,qword ptr [rax+8]
000000007711B3D8 mov qword ptr [rcx+90h],r8
000000007711B3DF mov r8,qword ptr [rax+10h]
000000007711B3E3 mov qword ptr [rcx+98h],r8
000000007711B3EA mov r8,qword ptr [rax+18h]
000000007711B3EE mov qword ptr [rcx+0A0h],r8
000000007711B3F5 mov r8,qword ptr [rax+20h]
000000007711B3F9 mov qword ptr [rcx+0A8h],r8
000000007711B400 mov r8,qword ptr [rax+28h]
000000007711B404 mov qword ptr [rcx+0B0h],r8
000000007711B40B mov r8,qword ptr [rax+30h]
000000007711B40F mov qword ptr [rcx+0D8h],r8
000000007711B416 mov r8,qword ptr [rax+38h]
000000007711B41A mov qword ptr [rcx+0E0h],r8
000000007711B421 mov r8,qword ptr [rax+40h]
000000007711B425 mov qword ptr [rcx+0E8h],r8
000000007711B42C mov r8,qword ptr [rax+48h]
000000007711B430 mov qword ptr [rcx+0F0h],r8
000000007711B437 mov r8,qword ptr [rax+50h]
000000007711B43B mov qword ptr [rcx+0F8h],r8
000000007711B442 mov r8d,dword ptr [rax+58h]
000000007711B446 mov dword ptr [rcx+34h],r8d
000000007711B44A mov dword ptr [rcx+118h],r8d
000000007711B451 mov r8w,word ptr [rax+5Ch]
000000007711B456 mov word ptr [rcx+100h],r8w
000000007711B45E movaps xmm0,xmmword ptr [rax+60h]
000000007711B462 movaps xmmword ptr [rcx+200h],xmm0
000000007711B469 movaps xmm0,xmmword ptr [rax+70h]
000000007711B46D movaps xmmword ptr [rcx+210h],xmm0
000000007711B474 movaps xmm0,xmmword ptr [rax+80h]
000000007711B47B movaps xmmword ptr [rcx+220h],xmm0
000000007711B482 movaps xmm0,xmmword ptr [rax+90h]
000000007711B489 movaps xmmword ptr [rcx+230h],xmm0
000000007711B490 movaps xmm0,xmmword ptr [rax+0A0h]
000000007711B497 movaps xmmword ptr [rcx+240h],xmm0
000000007711B49E movaps xmm0,xmmword ptr [rax+0B0h]
000000007711B4A5 movaps xmmword ptr [rcx+250h],xmm0
000000007711B4AC movaps xmm0,xmmword ptr [rax+0C0h]
000000007711B4B3 movaps xmmword ptr [rcx+260h],xmm0
000000007711B4BA movaps xmm0,xmmword ptr [rax+0D0h]
000000007711B4C1 movaps xmmword ptr [rcx+270h],xmm0
000000007711B4C8 movaps xmm0,xmmword ptr [rax+0E0h]
000000007711B4CF movaps xmmword ptr [rcx+280h],xmm0
000000007711B4D6 movaps xmm0,xmmword ptr [rax+0F0h]
000000007711B4DD movaps xmmword ptr [rcx+290h],xmm0
000000007711B4E4 mov eax,dword ptr [rcx+30h]
000000007711B4E7 and eax,100040h
000000007711B4EC cmp eax,100040h
000000007711B4F1 jne 000000007711B519
000000007711B4F3 mov r8d,dword ptr [rcx+34h]
000000007711B4F7 movsxd rax,dword ptr [rcx+4E0h]
000000007711B4FE lea rbx,[rcx+2D0h]
000000007711B505 add rbx,rax
000000007711B508 xchg r8d,dword ptr [rbx+18h]
000000007711B50C mov eax,0FFFFFFFCh
000000007711B511 cdq
000000007711B512 xrstor [rbx]
000000007711B515 mov dword ptr [rbx+18h],r8d
000000007711B519 fxrstor [rcx+100h]
The value of "rcx" in the debugger is 0x30e340, and the exception message in Visual Studio reads: "First-chance exception at 0x7711b519 in fptest.exe: 0xC0000005: Access violation reading location 0xffffffffffffffff."
Why would VS report that it is attempting to read 0xffffffffffffffff?
Finally got the answer to my own question. I had been using the line GetThreadContext(GetCurrentThread(),&myContext) to capture the current context, which I expected to hold my desired floating-point register values after calling _clearfp and _controlfp_s. However, I failed to notice this in the documentation for GetThreadContext: "If you call GetThreadContext for the current thread, the function returns successfully; however, the context returned is not valid."
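One plausible explanation for the fault at fxrstor, stated as an assumption rather than a confirmed diagnosis: per the Intel manuals, fxrstor raises a general-protection fault when the MXCSR image it loads has reserved bits set, and Windows typically reports such a fault as an access violation at 0xffffffffffffffff. An invalid context copied into FltSave.MxCsr could easily contain such bits. The sketch below shows the check; mxcsrIsLoadable is a hypothetical helper, and 0x0000FFBF is the default mask the Intel manual specifies when the saved MXCSR_MASK field is zero.

```cpp
#include <cstdint>

// Default MXCSR_MASK from the Intel SDM, used when the FXSAVE image
// reports a mask of zero. Bits that are 0 in the mask are reserved.
constexpr std::uint32_t kDefaultMxcsrMask = 0x0000FFBFu;

// Hypothetical helper: returns true if loading `mxcsr` would not try to
// set any reserved bit, given the MXCSR_MASK value from an FXSAVE image.
constexpr bool mxcsrIsLoadable(std::uint32_t mxcsr, std::uint32_t mxcsrMask) {
    return (mxcsr & ~(mxcsrMask != 0 ? mxcsrMask : kDefaultMxcsrMask)) == 0;
}
```

The power-on MXCSR value 0x1F80 passes this check, while a garbage pattern such as 0xCCCCCCCC does not.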
Turns out the right way to get the current thread's context is RtlCaptureContext. I'll edit my original code to reflect this.
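A minimal sketch of what the x64 branch of the filter could look like with RtlCaptureContext. The names recoverFromFpException and sseExceptionsMasked are hypothetical, the mask check is an extra sanity test that is not part of my original code, and the Windows-specific part only compiles for x64 targets.

```cpp
#include <cstdint>

// MXCSR bits 7-12 are the exception mask bits (1 = exception masked).
constexpr std::uint32_t kMxcsrExceptionMasks = 0x1F80u;

// Hypothetical sanity check: true if every SSE exception is masked.
constexpr bool sseExceptionsMasked(std::uint32_t mxcsr) {
    return (mxcsr & kMxcsrExceptionMasks) == kMxcsrExceptionMasks;
}

#ifdef _WIN64
#include <windows.h>
#include <float.h>

// Sketch only. A real filter would presumably also inspect
// ExceptionRecord->ExceptionCode and resume only for floating-point faults.
LONG WINAPI recoverFromFpException(EXCEPTION_POINTERS* info) {
    // Clear pending flags and mask all floating-point exceptions.
    _clearfp();
    unsigned int controlWord = 0;
    _controlfp_s(&controlWord, _MCW_EM, _MCW_EM);

    // Unlike GetThreadContext, this is valid for the calling thread.
    CONTEXT current;
    RtlCaptureContext(&current);

    // Refuse to resume if the captured state does not look as expected.
    if (!sseExceptionsMasked(current.MxCsr)) {
        return EXCEPTION_CONTINUE_SEARCH;
    }

    // Copy the masked state into the context that will be restored.
    info->ContextRecord->MxCsr = current.MxCsr;
    info->ContextRecord->FltSave.MxCsr = current.FltSave.MxCsr;
    info->ContextRecord->FltSave.MxCsr_Mask = current.FltSave.MxCsr_Mask;
    info->ContextRecord->FltSave.StatusWord = 0;
    return EXCEPTION_CONTINUE_EXECUTION;
}
#endif
```

It would be installed the same way as before, with SetUnhandledExceptionFilter(recoverFromFpException).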