c cuda cufft

Unable to allocate CUDA device memory for cufftComplex data type


I am trying to allocate a cufftComplex array in device memory on a CUDA device (GeForce GTX 1080) using the following code:

cufftComplex *d_in, *d_out;
int ds = sizeof(cufftComplex) * width * height;
CUresult test_din = cuMemAlloc((void**)&d_in, ds);
CUresult test_dout = cuMemAlloc((void**)&d_out, ds);
printf("test_din:  %s\n", cudaGetErrorString(test_din));
printf("test_dout:  %s\n", cudaGetErrorString(test_dout));

When I run this code the error that I get is:

test_din: initialization error

test_dout: initialization error

When I compile the code, I do get a warning about using void**, but all the cuFFT examples I've seen, including the code samples that come with CUDA 9.1, include the void** cast. The compiler note reads as follows:

/usr/local/cuda/include/cuda.h:90:49: note: expected 'CUdeviceptr *' but argument is of type 'void **'

Is there something obvious that I am doing wrong here?


Solution

  • cuMemAlloc is from the CUDA driver API.

    If you study any proper driver API program, you will find that one of the first things you need to do is to issue:

    cuInit(0);
    

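    to start using CUDA. Perhaps you have not done that (you are supposed to provide an MCVE). That is a likely reason for this particular error. Note that after cuInit, a driver API program must also make a context current before cuMemAlloc can succeed.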

    You will run into other disconnects between the CUDA driver API and the CUDA runtime API if you intermix the two. Mixing them should not be necessary for most codes, and I don't recommend it for beginners.

    Study sample codes to learn how to use one or the other. For example, study the vectorAdd sample code to learn the basics of a CUDA runtime API program. Study the corresponding vectorAddDrv to learn the basics of a CUDA driver API program.

    The easiest fix here is probably just to replace your calls to cuMemAlloc with cudaMalloc:

    cufftComplex *d_in, *d_out;
    size_t ds = sizeof(cufftComplex) * width * height;  // size_t avoids int overflow for large arrays
    cudaError_t test_din = cudaMalloc((void**)&d_in, ds);
    cudaError_t test_dout = cudaMalloc((void**)&d_out, ds);
    printf("test_din:  %s\n", cudaGetErrorString(test_din));
    printf("test_dout:  %s\n", cudaGetErrorString(test_dout));