
How to compile OpenCL kernel into bitstream?


How can I compile an OpenCL kernel into a bitstream that I can later load directly, without recompiling? My platform is an AMD machine with both an APU and a discrete AMD GPU. The machine is running the latest AMD APP SDK, which supports OpenCL 1.2.


Solution

  • 1) Create a program object from the kernel source with the clCreateProgramWithSource API call, then compile it with clBuildProgram. Compiler errors are retrieved with the clGetProgramBuildInfo API call (CL_PROGRAM_BUILD_LOG).

    2) Use the clGetProgramInfo API call to get CL_PROGRAM_BINARY_SIZES. These are the sizes of the program binaries, one per device associated with the program.

    2a) Allocate memory for the binaries using the sizes from 2).

    3) Use the clGetProgramInfo API call to get CL_PROGRAM_BINARIES. This copies each program binary into the memory you allocated for it.

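    4) A binary can be turned back into an OpenCL program object with the clCreateProgramWithBinary API call. The resulting program still has to be built with clBuildProgram before kernels can be created from it, but this skips compilation from source.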

    Binaries are device specific, so a binary compiled for one device will not run on a different device. In your case that means keeping separate binaries for the APU and the discrete GPU.

    Within a single process instance, once you have the environment (platform, device, context and queue) you can just re-use the OpenCL kernel object and re-execute it with another clEnqueueNDRangeKernel API call.
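    Steps 2) to 4) leave the binary in memory, so to load it in a later run it has to be written to disk and read back. Here is a minimal sketch of that part in plain C. The helper names save_blob and load_blob are hypothetical (they are not OpenCL functions), and the OpenCL calls themselves appear only in comments to mark where the data would come from and go to.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `size` bytes from `data` to `path`.
 * In an OpenCL program, `size` would be one entry of CL_PROGRAM_BINARY_SIZES
 * and `data` the matching buffer filled in via CL_PROGRAM_BINARIES.
 * Returns 0 on success, -1 on failure. */
int save_blob(const char *path, const unsigned char *data, size_t size)
{
    FILE *f = fopen(path, "wb");            /* binary mode: no newline translation */
    if (!f)
        return -1;
    size_t written = fwrite(data, 1, size, f);
    int close_failed = fclose(f);           /* fclose can report deferred write errors */
    return (written == size && close_failed == 0) ? 0 : -1;
}

/* Hypothetical helper: read the whole file at `path` into a malloc'd buffer.
 * On success returns the buffer and stores its length in *size_out;
 * the caller frees it. Returns NULL on failure.
 * The buffer and length are what clCreateProgramWithBinary expects
 * as its `binaries` and `lengths` entries for one device. */
unsigned char *load_blob(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return NULL;
    }
    long len = ftell(f);
    if (len < 0) {
        fclose(f);
        return NULL;
    }
    rewind(f);

    /* Allocate at least one byte so an empty file does not yield malloc(0). */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) {
        fclose(f);
        return NULL;
    }
    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) {
        free(buf);
        return NULL;
    }
    *size_out = (size_t)len;
    return buf;
}
```

    In the first run you would call save_blob once per device binary. In a later run, the buffer from load_blob goes to clCreateProgramWithBinary, followed by clBuildProgram. Since the binaries are device specific, it is probably worth putting the device name into the file name.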