c++cudac++17global-variablesconstexpr

Sharing constants between CPU and GPU in CUDA


I'd like to share some constants between CPU and GPU in order to allow for execution of the same code (wrapped in between) on either. That includes some compile-time parameters which are most reasonably written as arrays and I'd like them to be constexpr such that the compiler can (ideally) elide the arrays during compilation.

When I try

#include <stdio.h>
using fe_ftype = double;
__device__  constexpr fe_ftype vars[2] = {100.0, 300.0};
//__constant__ constexpr fe_ftype vars[2] = {100.0, 300.0};
const fe_ftype l = 3.0;

__global__ void foo() {
    printf("%lf %lf %lf\n", vars[0], vars[1], l);
}

int main(void) {
    foo<<<1,1>>>();
    cudaDeviceSynchronize();
    printf("%lf %lf %lf\n", vars[0], vars[1], l);
}

(see https://godbolt.org/z/19bYj34f8)

It seems to work in that compilation succeeds and I get the same output from both the GPU and CPU side, with the constants either as constant read-only data for CPU or being moved into the registers directly on the GPU side. However, the compiler complains with warning #20091-D: a __device__ variable "vars" cannot be directly read in a host function. If it were not a constant, I would be accessing some random uninitialized memory on the host, but with the constexpr annotation it seems to successfully grok it on the host side. It also seems to work identically if I use __constant__ instead of __device__.

Can I rely on this behaviour? Does the warning generally not apply to constexpr variables? Extra condition: I can only use C++17, so no consteval.


Solution

  • Can I rely on this behaviour?

    No, you cannot. NVIDIA forbids access to __device__ variables in host-side code. In fact, I have an open bug about (non-constexpr) __device__ variables being magically accessible in __host__ __device_ functions (bug 5307292; but you probably can't access it since the bugs are not public); and NVIDIA's current reply is that it's difficult for them to detect this happening, not that it's fine the way it is.

    Does the warning generally not apply to constexpr variables?

    It does. You see, constexpr variables still have addresses. And it possible to take their address and use them. So, in principle, you would have your host-side code reading directly from an address in device-global memory; that's not supposed to work. Compiler optimization might "make it happen", but unless NVIDIA guarantees that's what happens, it is basically a fluke.

    So, how can you share constants?

    Two possibilities to consider: