cuda global-variables idioms

Are global device-side variables in CUDA bad practice?


I am designing a library that has a large contingent of CUDA kernels to perform parallel computations. All the kernels will be acting on a common object, say a computational grid, which is defined using C++ style objects. The computational domain doesn't necessarily need to be accessed from the host side, so creating it on the device side and keeping it there makes sense for now. I'm wondering if the following is considered "good practice":

If my domain remains fixed (i.e. kernels don't modify the domain once it's created), is this OK? Is there a better way? I initially tried creating the Domain object on the host side and copying it over to the device, but this turned out to be a hassle: Domain is a relatively complex type, which makes it a pain to copy over using e.g. cudaMemcpy or even thrust::device_new (at least, I couldn't get it to work nicely).


Solution

  • Yes, it's OK.

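    You may be able to improve performance by declaring the object with

    __constant__

    With this qualifier, your object is available in all your kernels and is served from constant memory, which is cached and very fast when all the threads of a warp read the same address.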

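    To copy your object into it, you must use cudaMemcpyToSymbol. Please note there are some restrictions: the object is read-only in device code, constant memory is limited to 64 KB in total, and the type must have an empty constructor and destructor, because such variables cannot be dynamically initialized.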
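
    As a rough illustration of the above, here is a minimal sketch. The Domain struct, its fields, the d_domain symbol and the compute_cell_x helper are all hypothetical stand-ins, since the real Domain type is not shown in the question; the sketch assumes Domain is plain data with no user-defined constructor.

    ```cuda
    #include <cstdio>
    #include <cuda_runtime.h>

    // Hypothetical stand-in for the real Domain: plain data only, no
    // user-defined constructor or destructor, so it may live in constant memory.
    struct Domain {
        int   nx, ny;   // grid dimensions
        float dx, dy;   // cell spacing
    };

    // One read-only copy, visible to every kernel in this translation unit.
    __constant__ Domain d_domain;

    // Example kernel: each thread computes the x-coordinate of one cell.
    __global__ void cell_x(float *out)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        int n = d_domain.nx * d_domain.ny;
        if (i < n)
            out[i] = (i % d_domain.nx) * d_domain.dx;
    }

    // Uploads the domain, runs the kernel, copies the results back.
    // h_out must hold nx * ny floats. Returns 0 on success, 1 on any CUDA error.
    int compute_cell_x(const Domain &h_domain, float *h_out)
    {
        const int n = h_domain.nx * h_domain.ny;
        float *d_out = nullptr;

        // Host -> constant memory; the default kind is cudaMemcpyHostToDevice.
        if (cudaMemcpyToSymbol(d_domain, &h_domain, sizeof(Domain)) != cudaSuccess)
            return 1;
        if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess)
            return 1;

        cell_x<<<(n + 127) / 128, 128>>>(d_out);

        // cudaMemcpy waits for the kernel to finish before copying.
        cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(float),
                                     cudaMemcpyDeviceToHost);
        cudaFree(d_out);
        return err == cudaSuccess ? 0 : 1;
    }

    int main()
    {
        Domain h_domain = {4, 2, 0.5f, 0.5f};
        float  h_out[8];

        if (compute_cell_x(h_domain, h_out) != 0) {
            fprintf(stderr, "CUDA error\n");
            return 1;
        }
        for (int i = 0; i < 8; ++i)
            printf("cell %d: x = %.2f\n", i, h_out[i]);
        return 0;
    }
    ```

    The kernel never receives the domain as an argument; it simply reads d_domain, which is what makes this a drop-in replacement for a global __device__ variable when the data is small and fixed.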

    You can find more information here.

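    If your object is complex and hard to copy, you could look at Unified Memory instead: allocate the object with cudaMallocManaged, and the same pointer is valid on both host and device, so you can build the object on the host and just pass that pointer to your kernels.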
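
    A minimal sketch of the Unified Memory route might look like the following. Again, Domain, its coords member, the read_coords kernel and the run_managed_demo helper are hypothetical names chosen for illustration, and the sketch assumes a GPU and driver that support managed memory.

    ```cuda
    #include <cstdio>
    #include <cuda_runtime.h>

    // Hypothetical Domain with a nested allocation, the kind of type that is
    // awkward to deep-copy by hand with cudaMemcpy.
    struct Domain {
        int    n;        // number of grid points
        float *coords;   // per-point coordinate, also in managed memory
    };

    // Kernel only reads the domain; results go to a separate buffer.
    __global__ void read_coords(const Domain *dom, float *out)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < dom->n)
            out[i] = 2.0f * dom->coords[i];
    }

    // Builds the domain on the host, reads it on the device, and copies the
    // results into `result` (n floats). Returns 0 on success, 1 on CUDA error.
    // Error paths return without freeing, to keep the sketch short.
    int run_managed_demo(float *result, int n)
    {
        Domain *dom = nullptr;
        float  *out = nullptr;

        // Managed allocations: the same pointers work on host and device.
        if (cudaMallocManaged(&dom, sizeof(Domain)) != cudaSuccess) return 1;
        dom->n = n;
        if (cudaMallocManaged(&dom->coords, n * sizeof(float)) != cudaSuccess) return 1;
        if (cudaMallocManaged(&out, n * sizeof(float)) != cudaSuccess) return 1;

        // Fill in the domain with ordinary host code.
        for (int i = 0; i < n; ++i)
            dom->coords[i] = 0.25f * i;

        read_coords<<<(n + 127) / 128, 128>>>(dom, out);

        // Wait for the kernel before touching managed memory on the host again.
        if (cudaDeviceSynchronize() != cudaSuccess) return 1;

        for (int i = 0; i < n; ++i)
            result[i] = out[i];

        cudaFree(out);
        cudaFree(dom->coords);
        cudaFree(dom);
        return 0;
    }

    int main()
    {
        float result[4];
        if (run_managed_demo(result, 4) != 0) {
            fprintf(stderr, "CUDA error\n");
            return 1;
        }
        for (int i = 0; i < 4; ++i)
            printf("out[%d] = %.2f\n", i, result[i]);
        return 0;
    }
    ```

    The trade-off is that no explicit deep copy is needed, at the cost of letting the runtime migrate the data on demand; for a small, fixed domain the __constant__ approach may still be faster.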