python numpy cuda cufft

Converting a NumPy Array to cufftComplex


I am writing a script to perform an FFT using the GPU/CUDA-based cuFFT library. cuFFT requires its input data to be of type cufftComplex, but my input data is in numpy.complex64 format. I am using the Python C API to pass data from Python to C. How can I convert between the two formats? Currently my code looks like this:

#include<python2.7/Python.h>
#include<numpy/arrayobject.h>
#include<cufft.h>


void compute_BP(PyObject* inputData, PyObject* OutputData, int Nfft)
{
   cufftHandle plan;
   cufftPlan1d(&plan, Nfft, CUFFT_C2C, 1);  // fourth argument is the batch count, not the direction
   cufftExecC2C(plan, inputData, OutputData, CUFFT_INVERSE);
   ...
 }

When compiling I get the following error:

Error: argument of type "PyObject *" is incompatible with parameter of type "cufftComplex".


Solution

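  • No conversion is needed: numpy.complex64 stores each element as two 32-bit floats (real part, then imaginary part), which is the same memory layout as cufftComplex. You only need to pass a pointer to the array's data and copy that data to the GPU, since cuFFT operates on device memory. Borrowing from an earlier answer of mine, here is a worked example of how you could use ctypes in Python to call a function from the cuFFT library, using NumPy data: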

    $ cat mylib.cpp
    #include <cufft.h>
    #include <stdio.h>
    #include <assert.h>
    #include <cuda_runtime_api.h>
    extern "C"
    void fft(void *input, void *output, size_t N){
    
      cufftHandle plan;
      cufftComplex *d_in, *d_out;
      size_t ds = N*sizeof(cufftComplex);
      // allocate device buffers; cuFFT operates on device memory
      cudaMalloc((void **)&d_in,  ds);
      cudaMalloc((void **)&d_out, ds);
      // 1D complex-to-complex plan of length N, batch count 1
      cufftResult res = cufftPlan1d(&plan, N, CUFFT_C2C, 1);
      assert(res == CUFFT_SUCCESS);
      // numpy.complex64 has the same layout as cufftComplex, so the bytes are copied as-is
      cudaMemcpy(d_in, input, ds, cudaMemcpyHostToDevice);
      res = cufftExecC2C(plan, d_in, d_out, CUFFT_FORWARD);
      assert(res == CUFFT_SUCCESS);
      cudaMemcpy(output, d_out, ds, cudaMemcpyDeviceToHost);
      printf("%s\n", cudaGetErrorString(cudaGetLastError()));
      printf("from shared object:\n");
      for (size_t i = 0; i < N; i++)
        printf("%.1f + j%.1f, ", ((cufftComplex *)output)[i].x, ((cufftComplex *)output)[i].y);
      printf("\n");
      // release GPU resources
      cufftDestroy(plan);
      cudaFree(d_in);
      cudaFree(d_out);
    }
    
    $ cat t8.py
    import ctypes
    import numpy as np
    
    mylib = ctypes.cdll.LoadLibrary('libmylib.so')
    
    N = 4
    mydata = np.ones((N), dtype = np.complex64)
    myresult = np.zeros((N), dtype = np.complex64)
    mylib.fft(ctypes.c_void_p(mydata.ctypes.data), ctypes.c_void_p(myresult.ctypes.data), ctypes.c_size_t(N))
    print(myresult)
    
    $ g++ -fPIC -I/usr/local/cuda/include --shared mylib.cpp -L/usr/local/cuda/lib64 -lcufft -lcudart -o libmylib.so
    $ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd` python t8.py
    no error
    from shared object:
    4.0 + j0.0, 0.0 + j0.0, 0.0 + j0.0, 0.0 + j0.0,
    [4.+0.j 0.+0.j 0.+0.j 0.+0.j]
    $
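
The example above passes the raw NumPy pointer straight to the library, which relies on the array being packed complex64 data. The following is a minimal sketch, runnable without a GPU, of how one might check that layout and prepare an arbitrary array before handing it over. The names `Float2` and `as_cufft_buffer` are hypothetical helpers introduced here for illustration; they are not part of cuFFT or NumPy.

```python
import ctypes
import numpy as np


class Float2(ctypes.Structure):
    """Hypothetical ctypes mirror of cufftComplex: x is the real part, y the imaginary part."""
    _fields_ = [("x", ctypes.c_float), ("y", ctypes.c_float)]


def as_cufft_buffer(arr):
    """Return a C-contiguous complex64 array, copying only if needed."""
    return np.ascontiguousarray(arr, dtype=np.complex64)


# A strided view is not contiguous, so its raw pointer would not describe
# consecutive complex values; as_cufft_buffer makes a packed copy.
full = np.arange(8) * (1 + 1j)   # complex128 values 0, 1+1j, 2+2j, ...
source = full[::2]               # every second element, non-contiguous
buf = as_cufft_buffer(source)

# Each element occupies 8 bytes: a float32 real part followed by a float32 imaginary part.
same_size = buf.itemsize == ctypes.sizeof(Float2)

# Reinterpret the NumPy memory through the struct, without copying.
ptr = buf.ctypes.data_as(ctypes.POINTER(Float2))
second = (ptr[1].x, ptr[1].y)    # real and imaginary parts of buf[1]
```

An array prepared this way could then be passed as `ctypes.c_void_p(buf.ctypes.data)`, exactly as `mydata` is in t8.py. Keep a reference to `buf` alive for the duration of the call, since the pointer is only valid while the array exists.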