memory cuda global-variables

Do I need to mark a global variable used in a kernel as __device__?


I am using a global variable, say d_myVar, which will be allocated device memory using cudaMalloc() in the main function. I am not sure whether I should put __device__ in front of its global declaration. I ask because if it were a local variable on the host that was passed to a kernel, we would not write __device__ in front of it. Let me know if I am wrong.


Solution

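  • Globally scoped __device__ variables are not allocated with cudaMalloc; they are allocated statically in device memory. Simply annotate a variable in the global scope with __device__ and set its value from the host with cudaMemcpyToSymbol: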

     #include <stdio.h>
    
     __device__ int d_myVar;
    
     __global__ void foo()
     {
       printf("d_myVar is %d\n", d_myVar);
     }
    
     int main()
     {
       int h_myVar = 13;
       cudaMemcpyToSymbol(d_myVar, &h_myVar, sizeof(int), 0, cudaMemcpyHostToDevice);
       foo<<<1,1>>>();
       cudaDeviceSynchronize();  // replaces the deprecated cudaThreadSynchronize()
       return 0;
     }
    

    The result:

    $ nvcc -arch=sm_20 test.cu -run
    d_myVar is 13
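
For contrast, here is a minimal sketch of the other case the question describes: a global pointer whose memory is allocated with cudaMalloc(). In that case the global is an ordinary host variable that merely stores a device address, so it takes no __device__ qualifier and has to be passed to the kernel as an argument, just as a local host pointer would be. The helper name roundTrip and the CHECK macro are hypothetical, introduced only for this illustration.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical convenience macro: abort with a message if a runtime call fails.
#define CHECK(call)                                               \
  do {                                                            \
    cudaError_t err_ = (call);                                    \
    if (err_ != cudaSuccess) {                                    \
      fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_)); \
      exit(1);                                                    \
    }                                                             \
  } while (0)

// Ordinary host-side global (no __device__): it lives in host memory
// and only holds the device address returned by cudaMalloc.
int *d_myVar = NULL;

// The kernel cannot see the host global, so the address is passed in.
__global__ void foo(const int *p)
{
  printf("value is %d\n", *p);
}

// Hypothetical helper: copy a value to the device, run the kernel,
// and copy the value back so the host can inspect it.
int roundTrip(int h_value)
{
  int h_result = 0;
  CHECK(cudaMalloc((void **)&d_myVar, sizeof(int)));
  CHECK(cudaMemcpy(d_myVar, &h_value, sizeof(int), cudaMemcpyHostToDevice));
  foo<<<1,1>>>(d_myVar);
  CHECK(cudaDeviceSynchronize());
  CHECK(cudaMemcpy(&h_result, d_myVar, sizeof(int), cudaMemcpyDeviceToHost));
  CHECK(cudaFree(d_myVar));
  d_myVar = NULL;
  return h_result;
}

int main()
{
  // Exit code 0 if the value survived the trip to the device and back.
  return roundTrip(13) == 13 ? 0 : 1;
}
```

So the rule of thumb is: use __device__ (with cudaMemcpyToSymbol) when the variable itself should live on the device, and a plain pointer (with cudaMalloc and cudaMemcpy) when the variable only refers to device memory.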