numpy optimization numba contiguous

How to set up a batched matrix multiplication in Numba with np.dot() using contiguous arrays


I am trying to speed up a batched matrix multiplication with Numba, but it keeps warning me that np.dot() is faster on contiguous arrays.

Note: I'm using numba version 0.55.1, and numpy version 1.21.5

Here's the problem:

import numpy as np
import numba as nb

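# JIT decorator (assumed: nb.prange and the warning below both require a jitted function)
@nb.njit(parallel=True)
def numbaFastMatMult(mat,vec):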
    result = np.zeros_like(vec)
    for n in nb.prange(vec.shape[0]):
        result[n,:] = np.dot(vec[n,:], mat[n,:,:])
    return result

D,N = 10,1000
mat = np.random.normal(0,1,(N,D,D))
vec = np.random.normal(0,1,(N,D))

result = numbaFastMatMult(mat,vec)
n = 0  # any valid batch index; n is not defined outside the function
print(mat.data.contiguous)
print(vec.data.contiguous)
print(mat[n,:,:].data.contiguous)
print(vec[n,:].data.contiguous)

Clearly all the relevant data is contiguous (run the above code snippet and see the results of the print() calls).
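As a side check on the NumPy side only, here is a minimal sketch of how contiguity can be inspected through the `flags` attribute, and how a slice can lose it. The variable names (`batch_slice`, `strided_view`, `packed`) are illustrative and not part of the original code. Note that this says nothing about how Numba types the same slices internally.

```python
import numpy as np

N, D = 1000, 10
mat = np.random.normal(0, 1, (N, D, D))

# Slicing along the leading (batch) axis of a C-ordered array
# gives a view that is itself C-contiguous.
batch_slice = mat[0, :, :]
print(batch_slice.flags.c_contiguous)

# Fixing the last axis instead gives a strided, non-contiguous view.
strided_view = mat[:, :, 0]
print(strided_view.flags.c_contiguous)

# np.ascontiguousarray makes a packed copy when the input is not C-contiguous.
packed = np.ascontiguousarray(strided_view)
print(packed.flags.c_contiguous)
```

So in plain NumPy, `mat[n,:,:]` really is contiguous; the warning has to come from how the slice is typed inside the jitted function.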

But, when I run this code, I get the following warning:

NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (array(float64, 1d, C), array(float64, 2d, A))
  result[n,:] = np.dot(vec[n,:], mat[n,:,:])

Two extra comments:

  1. This is just a toy problem for replication. I'm actually using something with many more data points, so I'm hoping this will speed things up.
  2. I think the "right" way to solve this is with np.tensordot. However, I want to understand what's going on for future reference. For example, this discussion addresses a similar issue, but as far as I can tell, doesn't address why the warning shows up directly.
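For reference, a pure-NumPy version of the same batched product can be sketched as below. It uses `np.einsum` and `np.matmul` rather than `np.tensordot`, since as far as I know `np.tensordot` contracts axes but has no notion of a shared batch axis. The variable names are illustrative, and the loop is only there as a baseline to compare against.

```python
import numpy as np

N, D = 1000, 10
mat = np.random.normal(0, 1, (N, D, D))
vec = np.random.normal(0, 1, (N, D))

# Baseline: the explicit per-batch loop from the question, without Numba.
looped = np.stack([np.dot(vec[n], mat[n]) for n in range(N)])

# einsum: keep the batch index n, contract vec's d with mat's first matrix axis.
via_einsum = np.einsum("nd,nde->ne", vec, mat)

# matmul: treat each vec[n] as a 1 x D row vector and broadcast over the batch axis.
via_matmul = (vec[:, None, :] @ mat)[:, 0, :]

print(np.allclose(looped, via_einsum), np.allclose(looped, via_matmul))
```

Either form is also handy as a correctness check for a jitted implementation.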

I've tried adding a decorator:

@nb.njit(nb.float64[:,::1](nb.float64[:,:,::1], nb.float64[:,::1]))

I've also tried reordering the arrays so that the batch index (n in the above code) comes first, and printing whether the "mat" variable is contiguous from inside the function.


Solution

  • I'll leave this up, but I figured it out:

    Outside of a numba function:

    mat[n,:,:].data.contiguous==True
    

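    but inside Numba, the slice mat[n,:,:] is typed with layout A ("any") rather than C, as the array(float64, 2d, A) in the warning shows, so it is no longer treated as contiguous.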

    Changing my code above to np.dot(vec[n], mat[n]) removed the warning.

    I'm making this the "correct" answer since it solved my problem. However, according to max9111's response, this behavior may be a bug!