import math # all the libraries i import
import numpy as np
!pip install pycuda
import pycuda.gpuarray as gpu
import pycuda.cumath as cm
import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule
I get an error when using PyCUDA GPUArrays inside a for loop. I defined a function PropagatorS that uses nested for loops; it ran fine when I was just using numpy, but not after switching to CUDA.
def PropagatorS(N, L, area, z0):
    p = gpu.zeros((N,N), dtype = 'complex_')
    for ii in gpu.arange(0, N, 1):
        for jj in gpu.arange(0, N, 1):
            u = (ii - N/2 - 1)/area
            v = (jj - N/2 - 1)/area
            p[ii, jj] = cm.exp(1j*np.pi*L*z0*(u**2 + v**2))
    return p
Trying with some values:
p = PropagatorS(200, 700*10**-9, 0.002, 0.08)
returns "IndexError: invalid subindex in axis 0". The error happens in this line:
---> p[ii, jj] = cm.exp(1j*np.pi*L*z0*(u**2 + v**2))
I am using Colab to run this code. I can't find any troubleshooting threads on this; hopefully someone can help. :)
This is not how you use cumath. cumath functions like exp take an array argument and perform the work on that array. There is no need for the doubly-nested for-loops.

So: math.exp takes a scalar argument and raises e to the power of that argument. cumath.exp takes an input array and returns an array of the same shape, where each element of the returned array is e raised to the power of the corresponding element in the input array.
Here is a trivial example:
$ cat t31.py
import math # all the libraries i import
import numpy as np
import pycuda.gpuarray as gpu
import pycuda.cumath as cm
import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule
def PropagatorS(N):
    q = gpu.zeros((N,N), dtype = np.float32)
    p = gpu.ones_like(q)
    cm.exp(p, out=q)
    return q
p = PropagatorS(4)
print(p)
$ LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64 python t31.py
[[ 2.71828175 2.71828175 2.71828175 2.71828175]
[ 2.71828175 2.71828175 2.71828175 2.71828175]
[ 2.71828175 2.71828175 2.71828175 2.71828175]
[ 2.71828175 2.71828175 2.71828175 2.71828175]]
$
I think to do what you want you have at least a couple of options:

1. Create an array with your desired exponents in numpy, transfer that numpy array to a GPU array, then call cumath.exp on that GPU array.
2. Write a pycuda kernel to do it (a sketch of this approach is at the end of this answer).
Here is one possible example of how to do it using method 1, i.e. cumath.exp:
$ cat t32.py
import math # all the libraries i import
import numpy as np
import pycuda.gpuarray as gpu
import pycuda.cumath as cm
import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule
def PropagatorS(N, L, area, z0):
    p = np.zeros((N,N), dtype = np.complex64)
    for ii in range(0, N, 1):
        for jj in range(0, N, 1):
            u = (ii - N/2 - 1)/area
            v = (jj - N/2 - 1)/area
            p[ii, jj] = 1j*np.pi*L*z0*(u**2 + v**2)
    q = gpu.to_gpu(p)
    r = cm.exp(q)
    return r
p = PropagatorS(4, 700*10**-9, 0.002, 0.08)
print(p)
$ LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64 python t32.py
[[ 0.70264995+0.71153569j 0.84094459+0.54112124j 0.90482706+0.42577928j
0.92267275+0.385584j ]
[ 0.84094459+0.54112124j 0.93873388+0.34464294j 0.97591674+0.21814324j
0.98456436+0.17502306j]
[ 0.90482706+0.42577928j 0.97591674+0.21814324j 0.99613363+0.0878512j
0.99903291+0.04396812j]
[ 0.92267275+0.385584j 0.98456436+0.17502306j 0.99903291+0.04396812j
1.00000000+0.j ]]
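
For completeness, here is a minimal sketch of method 2, i.e. a custom pycuda kernel. The kernel name propagator, the helper PropagatorS_kernel, and the block/grid sizes are my own illustrative choices, not part of the original code. Since the exponent here is purely imaginary, exp(1j*theta) = cos(theta) + 1j*sin(theta), so the kernel can write the real and imaginary parts of each element directly into a complex64 (float2) array:

# sketch of option 2: fill the propagator array with a custom CUDA kernel
# (names and launch configuration are illustrative)
import numpy as np
import pycuda.gpuarray as gpu
import pycuda.autoinit
from pycuda.compiler import SourceModule

# each thread computes one element p[ii, jj] of the N x N array
mod = SourceModule("""
__global__ void propagator(float2 *p, int N, float L, float area, float z0)
{
    int jj = blockIdx.x * blockDim.x + threadIdx.x;   // column index
    int ii = blockIdx.y * blockDim.y + threadIdx.y;   // row index
    if ((ii < N) && (jj < N)) {
        float u = (ii - N/2.0f - 1.0f)/area;
        float v = (jj - N/2.0f - 1.0f)/area;
        float theta = 3.14159265358979f * L * z0 * (u*u + v*v);
        // exp(1j*theta) = cos(theta) + 1j*sin(theta), stored as (real, imag)
        p[ii*N + jj] = make_float2(cosf(theta), sinf(theta));
    }
}
""")
propagator = mod.get_function("propagator")

def PropagatorS_kernel(N, L, area, z0):
    p = gpu.zeros((N, N), dtype=np.complex64)   # complex64 matches the float2 layout
    block = (16, 16, 1)
    grid = ((N + block[0] - 1)//block[0], (N + block[1] - 1)//block[1], 1)
    propagator(p, np.int32(N), np.float32(L), np.float32(area), np.float32(z0),
               block=block, grid=grid)
    return p

p = PropagatorS_kernel(4, 700*10**-9, 0.002, 0.08)
print(p)

This avoids building the array on the host entirely; the element-wise math runs on the GPU in a single kernel launch instead of a python double loop followed by a copy.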