I'm trying to compile a simple example from the NVIDIA/cuda-samples repository on GitHub in Windows PowerShell [1]:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:44:19_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.6, V12.6.20
Build cuda_12.6.r12.6/compiler.34431801_0
When I try to compile the example, I get the following error:
$ git clone https://github.com/NVIDIA/cuda-samples.git
$ cd .\cuda-samples\Samples\0_Introduction\simpleAssert\
$ nvcc -I..\..\..\Common\ .\simpleAssert.cu
simpleAssert.cu
nvcc error : 'cudafe++' died with status 0xC0000005 (ACCESS_VIOLATION)
This error has been reported on the NVIDIA developer forum, but it was left unresolved there.
[1] I have an NVIDIA GeForce RTX 3050 (4 GB) Laptop GPU.
I'm adding an answer that builds on Stefan's, with more details. I took Stefan's advice and looked at the command line generated by a new CUDA Visual Studio project. Here it is in its entirety (it is very long):
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin\nvcc.exe" -gencode=arch=compute_52,code=\"sm_52,compute_52\" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" -x cu -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include" -G --keep-dir x64\Debug -maxrregcount=0 --machine 64 --compile -cudart static -g -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/Fdx64\Debug\vc143.pdb" -o C:\Users\tuna_\source\repos\CudaRuntime2\x64\Debug\kernel.cu.obj "C:\Users\tuna_\source\repos\CudaRuntime2\kernel.cu
I removed flags that seemed irrelevant, one at a time, until I narrowed it down to the -ccbin flag [1]:
$ nvcc -ccbin C:\'Program Files'\'Microsoft Visual Studio'\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64 -I..\..\..\Common\ .\simpleAssert.cu
simpleAssert.cu
tmpxft_0000a6b8_00000000-10_simpleAssert.cudafe1.cpp
Creating library a.lib and object a.exp
Compilation succeeds (a.exe is generated), and running it seems fine [2] too:
$ .\a.exe
simpleAssert starting...
GPU Device 0: "Ampere" with compute capability 8.6
Launch kernel to generate assertion failures
-- Begin assert output
C:\Users\tuna_\GitHub\cuda-samples\Samples\0_Introduction\simpleAssert\simpleAssert.cu:63: block: [1,0,0], thread: [28,0,0] Assertion `gtid < N` failed.
C:\Users\tuna_\GitHub\cuda-samples\Samples\0_Introduction\simpleAssert\simpleAssert.cu:63: block: [1,0,0], thread: [29,0,0] Assertion `gtid < N` failed.
C:\Users\tuna_\GitHub\cuda-samples\Samples\0_Introduction\simpleAssert\simpleAssert.cu:63: block: [1,0,0], thread: [30,0,0] Assertion `gtid < N` failed.
C:\Users\tuna_\GitHub\cuda-samples\Samples\0_Introduction\simpleAssert\simpleAssert.cu:63: block: [1,0,0], thread: [31,0,0] Assertion `gtid < N` failed.
-- End assert output
Device assert failed as expected, CUDA error message is: device-side assert triggered
simpleAssert completed, returned OK
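For readers wondering why exactly four assertion lines appear, here is a minimal sketch of a kernel that would behave the same way. It is not the sample's actual source: the kernel name assertKernel and the values N = 60, 2 blocks, and 32 threads per block are assumptions inferred from the output above (block 1, threads 28 to 31 correspond to global indices 60 to 63).

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not copied from simpleAssert.cu): each thread computes
// its global index and asserts that it is below N.
__global__ void assertKernel(int N)
{
    int gtid = blockIdx.x * blockDim.x + threadIdx.x;
    // Threads with gtid >= N trip the device-side assert.
    assert(gtid < N);
}

int main()
{
    // Assumed launch shape: 2 blocks of 32 threads gives gtid 0..63,
    // so with N = 60 the last four threads of block 1 fail.
    const int N = 60;
    const int threadsPerBlock = 32;
    const int blocks = 2;

    assertKernel<<<blocks, threadsPerBlock>>>(N);

    // The failed assert is reported to the host at the next synchronizing call.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaErrorAssert) {
        printf("Device assert failed as expected: %s\n", cudaGetErrorString(err));
        return 0;
    }
    printf("Unexpected result: %s\n", cudaGetErrorString(err));
    return 1;
}
```

Under these assumptions, the sketch should compile with the same -ccbin command line as above, with the file name swapped in.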
[1] This fits well with Stefan's answer about the x64 vs. x86 issue.
[2] The only thing that bothers me a bit is that PowerShell hangs after each execution.