cuda default-value cuda-streams

In CUDA, is it guaranteed that the default stream equals nullptr?


In CUDA's driver_types.h we have:

typedef __device_builtin__ struct CUstream_st *cudaStream_t;

And in cuda_runtime.h, many functions take a stream parameter whose default argument is 0. For example:

template<class T>
    static __inline__ __host__ cudaError_t cudaLaunchKernel(
    const T *func,
    dim3 gridDim,
    dim3 blockDim,
    void **args,
    size_t sharedMem = 0,
    cudaStream_t stream = 0
)

How safe is it to assume the default stream is (cudaStream_t) nullptr?


Solution

  • This is documented to be the case in multiple places:

    1. Programming guide:

    Kernel launches and host <-> device memory copies that do not specify any stream parameter, or equivalently that set the stream parameter to zero, are issued to the default stream.

    2. Runtime API reference, for example cudaMemcpyAsync:

    The copy can optionally be associated to a stream by passing a non-zero stream argument.

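    It seems quite safe to assume that the default stream is equivalent to (cudaStream_t) 0. Since cudaStream_t is a pointer type, (cudaStream_t) 0 is the null pointer, so it compares equal to nullptr.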
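The pointer equivalence can be checked without a GPU. The following is a minimal sketch, assuming a local stand-in for the typedef quoted from driver_types.h (with `__device_builtin__` omitted); `is_null_stream`, `launched_on_null_stream` and `check_null_stream_equivalence` are hypothetical names used for illustration and are not part of the CUDA API.

```cpp
#include <cassert>

// Local stand-in (assumption): mirrors the typedef quoted from driver_types.h
// so that this sketch builds without the CUDA toolkit installed.
struct CUstream_st;                        // opaque type, never defined here
typedef struct CUstream_st *cudaStream_t;

// Hypothetical helper: reports whether a handle is the null stream handle.
constexpr bool is_null_stream(cudaStream_t stream) { return stream == nullptr; }

// Hypothetical function using the same defaulted-parameter style as the
// runtime API declarations (cudaStream_t stream = 0).
constexpr bool launched_on_null_stream(cudaStream_t stream = 0) {
    return is_null_stream(stream);
}

// The literal 0 is a null pointer constant, so all three spellings agree.
static_assert(is_null_stream(0), "0 converts to the null stream handle");
static_assert(is_null_stream(nullptr), "nullptr is the null stream handle");
static_assert(launched_on_null_stream(), "the default argument is the null handle");

// Runtime version of the same checks.
inline void check_null_stream_equivalence() {
    cudaStream_t from_zero = 0;
    cudaStream_t from_nullptr = nullptr;
    assert(from_zero == from_nullptr);
    assert(launched_on_null_stream());
}
```

This only demonstrates the language-level fact that 0 and nullptr produce the same handle value; which stream that handle selects at run time is decided by the CUDA runtime, not by this code.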

    Note that you can also name a specific default stream explicitly, with an argument other than zero, using cudaStreamLegacy (or cudaStreamPerThread) as described here. Interestingly, in CUDA 11.4, cudaStreamLegacy is a #define in driver_types.h as follows:

    #define cudaStreamLegacy ((cudaStream_t)0x1)
    

    This probably makes sense: cudaStreamLegacy always refers to the legacy default stream, whereas a stream argument of 0 refers to whichever default stream is currently in effect, whether that happens to be the legacy default stream or the per-thread default stream, as indicated in the previously linked blog.

    Similarly, cudaStreamPerThread is defined as ((cudaStream_t)0x2).
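To make the distinction between the three special handle values concrete, here is a small sketch that classifies a handle by value. It assumes local stand-ins that mirror the typedef and the CUDA 11.4 macro values quoted above, so it compiles without the toolkit; `StreamKind` and `classify` are hypothetical names, not CUDA API.

```cpp
#include <cassert>

// Local stand-ins (assumption): these mirror the typedef and the CUDA 11.4
// macro values quoted above; real code would include the CUDA headers instead.
struct CUstream_st;
typedef struct CUstream_st *cudaStream_t;
#define cudaStreamLegacy    ((cudaStream_t)0x1)
#define cudaStreamPerThread ((cudaStream_t)0x2)

// Hypothetical classification of a stream handle by its value.
enum class StreamKind { Default, Legacy, PerThread, UserCreated };

inline StreamKind classify(cudaStream_t stream) {
    if (stream == nullptr)             return StreamKind::Default;    // 0 / nullptr
    if (stream == cudaStreamLegacy)    return StreamKind::Legacy;     // always legacy
    if (stream == cudaStreamPerThread) return StreamKind::PerThread;  // always per-thread
    return StreamKind::UserCreated;    // anything else is treated as a created stream
}
```

For example, `classify(0)` and `classify(nullptr)` both yield `StreamKind::Default`, while `classify(cudaStreamLegacy)` yields `StreamKind::Legacy`. The sketch compares handle values only; it does not query the runtime to find out which default-stream mode is in effect.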