I have a sample code that illustrates the issue:
import numpy as np
from numba import cuda, types
import configs

def main():
    arr = np.empty(0, dtype=np.uint8)
    stream = cuda.stream()
    d_arr = cuda.to_device(arr, stream=stream)
    kernel[configs.BLOCK_COUNT, configs.THREAD_COUNT, stream](d_arr)

@cuda.jit(types.void(
    types.Array(types.uint8, 1, 'C'),
), debug=configs.CUDA_DEBUG)
def kernel(d_arr):
    arr = cuda.const.array_like(d_arr)

if __name__ == "__main__":
    main()
When I run this code with cuda-memcheck, I get:
numba.errors.ConstantInferenceError: Failed in nopython mode pipeline (step: nopython rewrites)
Constant inference not possible for: arg(0, name=d_arr)
This seems to indicate that the array I passed in was not a constant, so it could not be copied to constant memory. Is that the case? If so, how can I copy an array that was given to a kernel as input into constant memory?
You can't populate a constant array from an array that was passed to the kernel as input. That kind of input array already resides on the device, and device code cannot write to constant memory.
Constant memory can only be written to from host code, and cuda.const.array_like expects a host array.
Here is an example:
$ cat t32.py
import numpy as np
from numba import cuda, types, int32, int64

a = np.ones(3, dtype=np.int32)

@cuda.jit
def generate_mutants(b):
    c_a = cuda.const.array_like(a)
    b[0] = c_a[0]

if __name__ == "__main__":
    b = np.zeros(3, dtype=np.int32)
    generate_mutants[1, 1](b)
    print(b)
$ python t32.py
[1 0 0]
$
Note that the implementation of constant memory in Numba CUDA has some behavioral differences compared to what is possible with CUDA C/C++; this issue highlights some of them.