I am trying to speed up a batched matrix multiplication problem with numba, but it keeps telling me that it's faster with contiguous code.
Note: I'm using numba version 0.55.1, and numpy version 1.21.5
Here's the problem:
import numpy as np
import numba as nb
def numbaFastMatMult(mat,vec):
result = np.zeros_like(vec)
for n in nb.prange(vec.shape[0]):
result[n,:] = np.dot(vec[n,:], mat[n,:,:])
return result
D,N = 10,1000
mat = np.random.normal(0,1,(N,D,D))
vec = np.random.normal(0,1,(N,D))
result = numbaFastMatMult(mat,vec)
print(mat.data.contiguous)
print(vec.data.contiguous)
print(mat[n,:,:].data.contiguous)
print(vec[n,:].data.contiguous)
clearly all the relevant data is contiguous (run the above code snippet and see the results of print()...
But, when I run this code, I get the following warning:
NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (array(float64, 1d, C), array(float64, 2d, A))
result[n,:] = np.dot(vec[n,:], mat[n,:,:])
2 Extra comments:
I've tried adding a decorator:
nb.float64[:,::1](nb.float64[:,:,::1],nb.float64[:,::1]),
I've tried reordering the arrays so the batch index is first (n in the above code) I've tried printing whether the "mat" variable is contiguous from inside the function
I'll leave this up, but I figured it out:
Outside of a numba function:
mat[n,:,:].data.contiguous==True
but inside numba, mat[n,:,:]
is no longer continous.
Changing my code above to np.dot(vec[n], mat[n])
removed the warning.
I'm making this the "correct" answer since it solved my problem. However, according to max9111's response, this behavior may be a bug!