I am working on a QT application for which I've integrated DirectX 11 into a custom widget. The application renders a scrolling display - a graphical representation of data being read from a file. The user can speed up and slow down the scrolling speed.
For the most part, this is working great. The DirectX 11 rendering is presented to my custom widget just as I'd expect. The problem is that the graphics driver randomly hangs and crashes my program. I say "random" because I have been testing this with the same data file and it never seems to crash at the same point in the file, after a specific amount of time, or at a specific scrolling speed (the faster the scrolling speed, the more work being done by the GPU per frame).
When the application hangs, my screen freezes for a moment, goes black, then returns with a nice message from NVidia that it has recovered from a driver crash. The Debug Output in Visual Studio contains the following:
D3D11: Removing Device.
D3D11 ERROR: ID3D11Device::RemoveDevice: Device removal has been triggered for the following reason (DXGI_ERROR_DEVICE_HUNG: The Device took an unreasonable amount of time to execute its commands, or the hardware crashed/hung. As a result, the TDR (Timeout Detection and Recovery) mechanism has been triggered. The current Device Context was executing commands when the hang occurred. The application may want to respawn and fallback to less aggressive use of the display hardware). [ EXECUTION ERROR #378: DEVICE_REMOVAL_PROCESS_AT_FAULT]
I have discovered that by simply commenting out the IDXGISwapChain1::Present call, the application will run through the file at blazing speed. Graphics-wise it is still pushing data to the GPU and drawing to render targets, it just never gets displayed to my window.
What I'm hoping for is help with ideas of what types of things cause driver hangs. My shaders are incredibly simple - basically just positioning my vertices using a projection matrix. And considering what I described in the above paragraph, shaders should still be cranking through vertices and pixels even when Present isn't being called, yes?
I was suspicious that this could be a compatibility issue with Qt - I know DirectX isn't officially supported by Qt. So I tried creating a separate window using CreateWindowEx and using that for my swap chain instead of the custom Qt widget. It rendered to that window but also hung the driver just like before.
I was also suspicious of a driver bug in my laptop, so I tried running the application on a beefier desktop PC that regularly runs another DirectX 11 application (non-Qt) without a hitch (worth mentioning that this other application renders similar data in a scrolling display as well, using shaders that are a lot more complex). But my QT application hangs the driver on that PC as well.
Anyone know of a way I can get a more detailed description of what is causing the driver to hang?
Thank you in advance for any help you can offer.
UPDATE: 2013-08-01 17:16 CST I am currently investigating a possible thread syncing issue that may be the culprit. Will continue tomorrow morning and post if I solve this on my own.
After some testing today, it appears to have been a threading issue. I have run several times today with no graphics crash. So my problem must be fixed, unless I've just been getting lucky with my tests today (or unlucky, rather - if this shows its ugly face again in a day or two).
I was aware that the immediate device context is not thread safe. Rather than using deferred contexts, though, I was using critical sections to sync my threads and coordinate use of the device context. What I did not realize is that it is not safe to call IDXGISwapChain1::Present while another thread is using the device context. Makes sense, but since it is not call directly from the device context itself, I overlooked it. I literally moved my Present() call a few lines up into my critical section block, and it hasn't given me a crash since.