[SOLVED] PyCuda - use *.cubin - named symbol not found

I am trying to use a compiled *.cubin file with PyCuda, but I get this error:

    func = mod.get_function("doublify")
    pycuda._driver.LogicError: cuModuleGetFunction failed: named symbol not found

Content of doublify.cu:

    __global__ void doublify(float *a)
    {
        int idx = threadIdx.x + threadIdx.y * 4;
        a[idx] *= 2;
    }

I compiled it with the following command:

    nvcc --cubin -arch sm_75 doublify.cu

This is my Python script:

    import pycuda.driver as cuda
    import pycuda.autoinit  # creates a CUDA context on import
    import numpy

    a = numpy.random.randn(4, 4)
    a = a.astype(numpy.float32)
    a_gpu = cuda.mem_alloc(a.nbytes)
    cuda.memcpy_htod(a_gpu, a)  # copy the input to the device

    mod = cuda.module_from_file("doublify.cubin")

    func = mod.get_function("doublify")
    func(a_gpu, block=(4, 4, 1))

    a_doubled = numpy.empty_like(a)  # host buffer for the result
    cuda.memcpy_dtoh(a_doubled, a_gpu)

    print(a)
    print(a_doubled)

Do I need to pass additional flags to nvcc? If I compile the same kernel with PyCuda's SourceModule, everything works as expected. Loading fails in the same way when I compile to a *.fatbin instead.

Solution

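Figured it out myself after debugging PyCuda itself. If anyone else stumbles upon the same problem, this is the solution: I was missing the extern "C" statement at the beginning of the *.cu file. Without it, nvcc compiles the kernel as C++ and mangles its name (to _Z8doublifyPf here), so cuModuleGetFunction cannot find a symbol called "doublify". SourceModule works because it wraps the source in extern "C" by default.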

    extern "C"
    __global__ void doublify(float *a)
    {
        int idx = threadIdx.x + threadIdx.y * 4;
        a[idx] *= 2;
    }
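
If you want to check this on your own file without stepping through PyCuda, here is a minimal sketch that scans the raw bytes of a cubin for identifiers containing the kernel name. The helper names (find_symbol_candidates, symbols_in_file) are made up for this example and are not part of PyCuda. It assumes symbol names are stored as plain ASCII in the file, so treat it as a quick heuristic, not a real ELF parser.

```python
import re
from typing import List


def find_symbol_candidates(blob: bytes, kernel_name: str) -> List[str]:
    """Return the distinct identifiers in `blob` that contain `kernel_name`.

    Heuristic only: matches runs of identifier characters around the name,
    so a section name like ".text._Z8doublifyPf" yields "_Z8doublifyPf".
    """
    ident = rb"[A-Za-z0-9_$]*"
    pattern = ident + re.escape(kernel_name.encode("ascii")) + ident
    found = {m.group().decode("ascii") for m in re.finditer(pattern, blob)}
    return sorted(found)


def is_mangled(symbol: str) -> bool:
    """C++ (Itanium ABI) mangled names start with "_Z"."""
    return symbol.startswith("_Z")


def symbols_in_file(path: str, kernel_name: str) -> List[str]:
    """Hypothetical convenience wrapper: scan a cubin file on disk."""
    with open(path, "rb") as f:
        return find_symbol_candidates(f.read(), kernel_name)
```

Calling symbols_in_file("doublify.cubin", "doublify") on the file built without extern "C" should list a name starting with _Z. In that case either add extern "C" as above, or pass the mangled name to mod.get_function() unchanged.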