Tags: python, cuda, index-error, pycuda

IndexError: "invalid subindex in axis 0" with PyCUDA


import math # all the libraries i import
import numpy as np
!pip install pycuda
import pycuda.gpuarray as gpu
import pycuda.cumath as cm

import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule

I get an error when using PyCUDA GPUArrays inside a for loop. I defined a function PropagatorS that uses nested for loops; it ran fine when I was just using numpy, but fails after switching to CUDA.

def PropagatorS(N, L, area, z0):

  p = gpu.zeros((N,N), dtype = 'complex_')

  for ii in gpu.arange(0, N, 1):
      for jj in gpu.arange(0, N, 1):
          u = (ii - N/2 - 1)/area
          v = (jj - N/2 - 1)/area
          p[ii, jj] = cm.exp(1j*np.pi*L*z0*(u**2 + v**2))
  return p

Trying with some values:

p = PropagatorS(200, 700*10**-9, 0.002, 0.08)

returns "IndexError: invalid subindex in axis 0". The error happens in this line:

--->    p[ii, jj] = cm.exp(1j*np.pi*L*z0*(u**2 + v**2))

I am using Colab to run this code. I can't find any troubleshooting threads on this, hopefully someone can help. :)


Solution

  • This is not how you use cumath.

    cumath functions like exp take an entire array as their argument and perform the operation element-wise on that array. There is no need for the doubly-nested for loops.

    So:

    math.exp takes a scalar argument and returns e raised to the power of that argument.

    cumath.exp takes an input array and returns an array of the same shape, where each element of the returned array is e raised to the power of the corresponding element of the input array.

    Here is a trivial example:

    $ cat t31.py
    import math # all the libraries i import
    import numpy as np
    import pycuda.gpuarray as gpu
    import pycuda.cumath as cm
    
    import pycuda.autoinit
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule
    
    def PropagatorS(N):
    
      q = gpu.zeros((N,N), dtype = np.float32)
      p = gpu.ones_like(q)
      cm.exp(p, out=q)
      return q
    
    
    p = PropagatorS(4)
    print(p)
    $ LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64 python t31.py
    [[ 2.71828175  2.71828175  2.71828175  2.71828175]
     [ 2.71828175  2.71828175  2.71828175  2.71828175]
     [ 2.71828175  2.71828175  2.71828175  2.71828175]
     [ 2.71828175  2.71828175  2.71828175  2.71828175]]
    $
    

    To do what you want, I think you have at least two options:

    1. create an array with your desired exponents in numpy. Transfer that numpy array to a GPU array. Then call cumath.exp on that GPU array.

    2. write a PyCUDA kernel to do it (a sketch of this approach is shown at the end of this answer).

    Here is one possible example of how to do it using method 1, i.e. cumath.exp:

    $ cat t32.py
    import math # all the libraries i import
    import numpy as np
    import pycuda.gpuarray as gpu
    import pycuda.cumath as cm
    
    import pycuda.autoinit
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule
    
    def PropagatorS(N, L, area, z0):
    
      p = np.zeros((N,N), dtype = np.complex64)
    
      for ii in range(0, N, 1):
          for jj in range(0, N, 1):
              u = (ii - N/2 - 1)/area
              v = (jj - N/2 - 1)/area
              p[ii, jj] = 1j*np.pi*L*z0*(u**2 + v**2)
      q = gpu.to_gpu(p)
      r = cm.exp(q)
      return r
    
    
    p = PropagatorS(4, 700*10**-9, 0.002, 0.08)
    print(p)
    $ LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64 python t32.py
    [[ 0.70264995+0.71153569j  0.84094459+0.54112124j  0.90482706+0.42577928j
       0.92267275+0.385584j  ]
     [ 0.84094459+0.54112124j  0.93873388+0.34464294j  0.97591674+0.21814324j
       0.98456436+0.17502306j]
     [ 0.90482706+0.42577928j  0.97591674+0.21814324j  0.99613363+0.0878512j
       0.99903291+0.04396812j]
     [ 0.92267275+0.385584j    0.98456436+0.17502306j  0.99903291+0.04396812j
       1.00000000+0.j        ]]
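
    For completeness, here is a minimal sketch of method 2, i.e. doing the whole calculation in a PyCUDA kernel so the doubly-nested host loop disappears entirely. The kernel name propagator, the 16x16 block size, and the use of float2 to write into a complex64 gpuarray are my own illustrative choices, not anything PyCUDA requires; treat it as a starting point rather than a drop-in replacement.

    import numpy as np
    import pycuda.autoinit
    import pycuda.gpuarray as gpu
    from pycuda.compiler import SourceModule
    
    # One thread per (ii, jj) element. A complex64 gpuarray stores interleaved
    # (real, imag) float pairs, so the kernel writes float2 values into it.
    mod = SourceModule("""
    __global__ void propagator(float2 *p, int N, float L, float area, float z0)
    {
        int jj = blockIdx.x * blockDim.x + threadIdx.x;
        int ii = blockIdx.y * blockDim.y + threadIdx.y;
        if (ii < N && jj < N) {
            float u = (ii - N/2.0f - 1.0f) / area;
            float v = (jj - N/2.0f - 1.0f) / area;
            float phase = 3.14159265f * L * z0 * (u*u + v*v);
            // exp(i*phase) = cos(phase) + i*sin(phase)
            p[ii*N + jj] = make_float2(cosf(phase), sinf(phase));
        }
    }
    """)
    propagator = mod.get_function("propagator")
    
    def PropagatorS(N, L, area, z0):
      p = gpu.zeros((N, N), dtype=np.complex64)
      block = (16, 16, 1)
      grid = ((N + block[0] - 1) // block[0], (N + block[1] - 1) // block[1], 1)
      propagator(p.gpudata, np.int32(N), np.float32(L), np.float32(area),
                 np.float32(z0), block=block, grid=grid)
      return p
    
    p = PropagatorS(4, 700*10**-9, 0.002, 0.08)
    print(p)

    Like the cumath example above, this works in single precision (complex64); if you need the complex128 accuracy of your original dtype = 'complex_', the kernel would have to use double2, cos and sin instead.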