Tags: python, cuda, nvidia, pycuda

How to use PyCUDA mem_alloc_pitch()


I've recently been trying out PyCUDA.

I currently want to do something very simple: allocate some memory. I assume I have some fundamental misunderstanding, because this should be a simple task. My understanding is that the code below creates a 2D CUDA array 512 wide and 160 high, with an element size of 1 byte.

Here is some test code:

import pycuda.driver as cuda
import pycuda.autoinit
# Alloc some gpu memory
test_pitch = cuda.mem_alloc_pitch(512,160,1)

When I try running this code, I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
pycuda._driver.LogicError: cuMemAllocPitch failed: invalid argument

If anyone has any insight into what I'm doing wrong, that would be greatly appreciated.


Solution

  • Quoting from the CUDA driver API documentation:

    cuMemAllocPitch ( CUdeviceptr* dptr, 
                      size_t* pPitch, 
                      size_t WidthInBytes, 
                      size_t Height, 
                      unsigned int  ElementSizeBytes )
    

    The function may pad the allocation to ensure that corresponding pointers in any given row will continue to meet the alignment requirements for coalescing as the address is updated from row to row. ElementSizeBytes specifies the size of the largest reads and writes that will be performed on the memory range. ElementSizeBytes may be 4, 8 or 16 (since coalesced memory transactions are not possible on other data sizes)

    In this case the first two arguments (dptr and pPitch) are output parameters, which PyCUDA hands back as the return values of mem_alloc_pitch, and ElementSizeBytes corresponds to access_size in the PyCUDA call.

    You have:

    cuda.mem_alloc_pitch(512,160,1)
    

    That is, your access_size is 1, which is illegal. Only 4, 8, or 16 are legal, hence the error.
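
A minimal sketch of the corrected call, assuming the (width, height, access_size) argument order used in the question and that mem_alloc_pitch returns the device allocation together with the actual pitch. The names check_access_size and alloc_pitched are hypothetical helpers written for illustration and are not part of PyCUDA. Since the quoted documentation describes ElementSizeBytes as the size of the largest reads and writes, passing 4 should also be acceptable for rows of 1-byte elements.

```python
# Access sizes accepted by cuMemAllocPitch, per the driver API text quoted above.
LEGAL_ACCESS_SIZES = (4, 8, 16)


def check_access_size(access_size):
    """Return access_size unchanged if it is legal, otherwise raise ValueError.

    Hypothetical helper for illustration; it is not part of PyCUDA.
    """
    if access_size not in LEGAL_ACCESS_SIZES:
        raise ValueError(
            "access_size must be one of %s, got %r"
            % (LEGAL_ACCESS_SIZES, access_size)
        )
    return access_size


def alloc_pitched(width_in_bytes, height, access_size=4):
    """Allocate pitched device memory and return (device_allocation, pitch).

    Hypothetical wrapper; calling it requires PyCUDA and a CUDA-capable GPU.
    """
    # Imported here so the validation helper above can be used without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.driver as cuda

    check_access_size(access_size)
    return cuda.mem_alloc_pitch(width_in_bytes, height, access_size)
```

On a machine with a GPU, test_pitch, pitch = alloc_pitched(512, 160, 4) would replace the failing call. The returned pitch is the row stride in bytes that the driver actually chose, and it may be larger than the requested 512.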