In CUDA's driver_types.h we have:
typedef __device_builtin__ struct CUstream_st *cudaStream_t;
And in cuda_runtime.h we have, in many places, default-initialized stream parameters. For example:
template<class T>
static __inline__ __host__ cudaError_t cudaLaunchKernel(
    const T *func,
    dim3 gridDim,
    dim3 blockDim,
    void **args,
    size_t sharedMem = 0,
    cudaStream_t stream = 0
)
How safe is it to assume the default stream is (cudaStream_t) nullptr?
This is documented to be the case in multiple places:
Kernel launches and host <-> device memory copies that do not specify any stream parameter, or equivalently that set the stream parameter to zero, are issued to the default stream.
cudaMemcpyAsync: The copy can optionally be associated to a stream by passing a non-zero stream argument.
It seems quite safe to assume that the default stream is equivalent to (cudaStream_t) 0.
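As a minimal sketch of what that assumption means at the type level: cudaStream_t is a plain pointer typedef, so a defaulted stream parameter, a literal 0, nullptr and (cudaStream_t)0 all produce the same null handle. The snippet deliberately does not include the CUDA headers. CUstream_st and cudaStream_t are re-declared locally to mirror the typedef quoted in the question, and launch_on(), is_null_stream() and check_null_stream_spellings() are hypothetical helpers, not part of the CUDA API.

```cpp
#include <cassert>

// Stand-in declarations mirroring the typedef from driver_types.h.
// Assumption: this is NOT the real header; drop these lines when
// building against the CUDA toolkit.
struct CUstream_st;                        // opaque, never defined here
typedef struct CUstream_st *cudaStream_t;

// Hypothetical function with a defaulted stream parameter, shaped like
// the cudaLaunchKernel template quoted above. It only reports which
// handle value it received.
inline cudaStream_t launch_on(cudaStream_t stream = 0) { return stream; }

// True when the handle is the null handle, i.e. "no explicit stream".
inline bool is_null_stream(cudaStream_t s) { return s == nullptr; }

// Every spelling below yields the same null pointer value.
inline void check_null_stream_spellings() {
    assert(is_null_stream(launch_on()));                 // defaulted argument
    assert(is_null_stream(launch_on(0)));                // literal zero
    assert(is_null_stream(launch_on(nullptr)));          // nullptr
    assert(is_null_stream(launch_on((cudaStream_t)0)));  // explicit cast
}
```

This only shows that the spellings are interchangeable as values. Which stream the runtime actually uses for a null handle is a matter of the documented behaviour quoted above, not of the language.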
Note that you can call out the default stream specifically with an argument other than zero, using cudaStreamLegacy (or cudaStreamPerThread) as described here. Interestingly, in CUDA 11.4, cudaStreamLegacy is a #define in driver_types.h as follows:
#define cudaStreamLegacy ((cudaStream_t)0x1)
This probably makes sense: cudaStreamLegacy always refers to the legacy default stream, whereas a stream argument of 0 refers to whichever default stream is currently in effect, whether that happens to be the legacy default stream or the per-thread default stream, as indicated in the previously linked blog.
Similarly, cudaStreamPerThread is defined to be (cudaStream_t)2.
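To make the distinction between the three special handles concrete, here is a small sketch that classifies a stream handle by comparing it against them. It does not include the CUDA headers: the typedef and the two macros are re-declared locally with the values mentioned in this answer (taken from CUDA 11.4, so treat them as an assumption for any other version), and classify() is a hypothetical helper, not a CUDA function.

```cpp
#include <cassert>
#include <cstring>

// Stand-in declarations using the values quoted above (CUDA 11.4).
// Assumption: with the real toolkit you would include the CUDA headers
// instead and delete these four lines.
struct CUstream_st;
typedef struct CUstream_st *cudaStream_t;
#define cudaStreamLegacy    ((cudaStream_t)0x1)
#define cudaStreamPerThread ((cudaStream_t)0x2)

// Hypothetical helper: describe which kind of handle was passed.
inline const char *classify(cudaStream_t s) {
    if (s == nullptr)             return "default (0)";        // whichever default is in effect
    if (s == cudaStreamLegacy)    return "legacy default";     // always the legacy stream
    if (s == cudaStreamPerThread) return "per-thread default"; // always the per-thread stream
    return "user-created";                                     // anything else
}
```

Comparing against the macro names rather than the raw numbers keeps such a check valid even if the underlying values were to change in a later toolkit release.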