I've recently been trying out PyCUDA.
I currently want to do something very simple: allocate some memory. I assume I have some fundamental misunderstanding, because this should be a simple task. My understanding is that the code below creates a 2D CUDA array that is 512 wide and 160 high, with an element size of 1 byte.
Here's some test code:
import pycuda.driver as cuda
import pycuda.autoinit
# Alloc some gpu memory
test_pitch = cuda.mem_alloc_pitch(512,160,1)
When I try running this code, I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
pycuda._driver.LogicError: cuMemAllocPitch failed: invalid argument
If anyone has any insight into what I'm doing wrong, that would be greatly appreciated.
Quoting from the CUDA driver API documentation:
cuMemAllocPitch ( CUdeviceptr* dptr,
size_t* pPitch,
size_t WidthInBytes,
size_t Height,
unsigned int ElementSizeBytes )
The function may pad the allocation to ensure that corresponding pointers in any given row will continue to meet the alignment requirements for coalescing as the address is updated from row to row. ElementSizeBytes specifies the size of the largest reads and writes that will be performed on the memory range. ElementSizeBytes may be 4, 8 or 16 (since coalesced memory transactions are not possible on other data sizes).
In this case, the first two arguments are the return values of mem_alloc_pitch, and ElementSizeBytes is access_size in the PyCUDA call.
You have:
cuda.mem_alloc_pitch(512,160,1)
i.e. your access_size is 1, which is illegal. Only 4, 8, or 16 are legal. Thus the error.
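As a minimal sketch of the fix, the snippet below checks access_size against the values the quoted documentation allows before calling mem_alloc_pitch. The names VALID_ACCESS_SIZES, check_access_size and alloc_pitched are hypothetical helpers made up for illustration and are not part of PyCUDA. Note that the width argument is already in bytes, so a row of 512 one-byte elements is still a width of 512; only the third argument needs to change, for example to 4.

```python
# Values the CUDA driver documentation quoted above lists as legal.
VALID_ACCESS_SIZES = (4, 8, 16)


def check_access_size(access_size):
    """Return access_size unchanged, or raise ValueError if it is not 4, 8 or 16."""
    if access_size not in VALID_ACCESS_SIZES:
        raise ValueError(
            "access_size must be one of %s, got %r"
            % (VALID_ACCESS_SIZES, access_size)
        )
    return access_size


def alloc_pitched(width_in_bytes, height, access_size=4):
    """Hypothetical wrapper: validate access_size, then allocate pitched memory.

    Returns the (device_allocation, pitch_in_bytes) tuple from mem_alloc_pitch.
    """
    check_access_size(access_size)
    # Imported lazily so the validation helper works on machines without a GPU.
    import pycuda.autoinit  # noqa: F401  (sets up a context on the default device)
    import pycuda.driver as cuda

    return cuda.mem_alloc_pitch(width_in_bytes, height, access_size)


# Usage (requires PyCUDA and a CUDA-capable GPU):
#   dev_ptr, pitch = alloc_pitched(512, 160, 4)
#   print("row pitch in bytes:", pitch)
```

The returned pitch may be larger than the requested width, so use it (not 512) when computing row offsets or setting up 2D copies.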