Tags: windows, cuda, gpu

Resetting GPU and driver after CUDA error


Sometimes bugs in my CUDA programs corrupt the desktop graphics on Windows. The screen usually remains somewhat readable, but whenever the display changes, such as when I drag a window, many semi-random colored pixels and small blocks appear.

I have tried to reset the GPU and driver by changing the desktop resolution, but that doesn't help. The only fix I have found is to reboot the computer.

Is there a program out there or some trick I can use to get the driver and GPU to reset without rebooting?


Solution

  • Edit:

    If you are on Tesla hardware on Linux and can run nvidia-smi, then you can reset the GPU using

    nvidia-smi -r
    

    or

    nvidia-smi --gpu-reset
    

    Here is the man page entry for this switch:

    Resets GPU state. Can be used to clear double bit ECC errors or recover hung GPU. Requires -i switch to target specific device. Available on Linux only.

    Otherwise...


    The way to truly reset the hardware is to reboot.

    What you describe shouldn't happen. I recommend testing with different hardware and letting us know whether it still occurs.
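
For the Linux case above, the man page entry says the reset switch needs -i to pick a device, so here is a minimal sketch of a wrapper that always passes one. The function names build_reset_cmd and reset_gpu are hypothetical, the default index of 0 is an assumption, and the reset typically needs root privileges and an otherwise idle GPU. It does not apply to Windows.

```shell
#!/bin/sh
# Sketch only: reset one GPU by index with nvidia-smi (Linux only).
# build_reset_cmd and reset_gpu are hypothetical helper names.

# Print the reset command for a GPU index (assumed default: 0).
build_reset_cmd() {
    printf 'nvidia-smi --gpu-reset -i %s\n' "${1:-0}"
}

# Run the reset if nvidia-smi is installed; otherwise report and fail.
reset_gpu() {
    idx="${1:-0}"
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; rebooting is the fallback" >&2
        return 1
    fi
    echo "Running: $(build_reset_cmd "$idx")"
    nvidia-smi --gpu-reset -i "$idx"
}
```

After sourcing the file, calling `reset_gpu 0` as root attempts the reset on device 0. If it fails or the tool is missing, rebooting remains the fallback.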