How to set up a batched matrix multiplication in Numba with np.dot() using contiguous arrays

I am trying to speed up a batched matrix multiplication problem with numba, but it keeps warning me that np.dot() is faster on contiguous arrays.

Note: I'm using numba version 0.55.1, and numpy version 1.21.5

Here's the problem:

import numpy as np
import numba as nb

@nb.njit(parallel=True)  # prange only runs in parallel with parallel=True
def numbaFastMatMult(mat,vec):
    result = np.zeros_like(vec)
    for n in nb.prange(vec.shape[0]):
        result[n,:] = np.dot(vec[n,:], mat[n,:,:])
    return result

D,N = 10,1000
mat = np.random.normal(0,1,(N,D,D))
vec = np.random.normal(0,1,(N,D))

result = numbaFastMatMult(mat,vec)
print(mat.data.contiguous)
print(vec.data.contiguous)
n = 0  # any valid batch index
print(mat[n,:,:].data.contiguous)
print(vec[n,:].data.contiguous)

Clearly, all the relevant data is contiguous (run the above code snippet and check the output of the print() calls).

But, when I run this code, I get the following warning:

NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (array(float64, 1d, C), array(float64, 2d, A))
  result[n,:] = np.dot(vec[n,:], mat[n,:,:])

Two extra comments:

1. This is just a toy problem for replication. I'm actually using something with many more data points, so I'm hoping this will speed it up.
2. I think the "right" way to solve this is with np.tensordot. However, I want to understand what's going on for future reference. For example, this discussion addresses a similar issue, but as far as I can tell, it doesn't directly address why the warning shows up.
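For reference, here is a NumPy-only sketch of the same batched product, assuming the goal is result[n] = vec[n] @ mat[n] for every n. Note that I am swapping in np.einsum and np.matmul here rather than np.tensordot: as far as I know, np.tensordot has no notion of a shared batch axis, so contracting axis 1 of vec with axis 1 of mat would give an (N, N, D) array instead of (N, D). The variable names below are only for illustration.

```python
import numpy as np

D, N = 10, 1000
rng = np.random.default_rng(0)
mat = rng.normal(0, 1, (N, D, D))
vec = rng.normal(0, 1, (N, D))

# Plain Python loop, same arithmetic as the Numba function in the question
loop = np.zeros_like(vec)
for n in range(N):
    loop[n] = np.dot(vec[n], mat[n])

# Batched product with einsum: sum over d of vec[n, d] * mat[n, d, e]
via_einsum = np.einsum("nd,nde->ne", vec, mat)

# Batched product with matmul: treat each vec[n] as a 1 x D row matrix
via_matmul = np.matmul(vec[:, None, :], mat)[:, 0, :]

print(np.allclose(loop, via_einsum), np.allclose(loop, via_matmul))
```

Either vectorised form can serve as a correctness check for the Numba version; whether it is also faster depends on N and D.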

I've tried adding a decorator:

nb.float64[:,::1](nb.float64[:,:,::1],nb.float64[:,::1]),

I've tried reordering the arrays so the batch index is first (n in the above code).

I've tried printing whether the "mat" variable is contiguous from inside the function.

Solution

I'll leave this up, but I figured it out:

Outside of a numba function:

mat[n,:,:].data.contiguous==True

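but inside numba, mat[n,:,:] is no longer treated as contiguous: the warning above shows it typed as array(float64, 2d, A), i.e. layout "A" (any) rather than "C".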
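A small NumPy-only sketch of the "outside numba" half of this observation, using the same shapes as the question. It checks the flags attribute instead of .data.contiguous (both are standard NumPy/Python attributes), and n = 0 is just an arbitrary batch index chosen for illustration.

```python
import numpy as np

D, N = 10, 1000
mat = np.random.normal(0, 1, (N, D, D))
vec = np.random.normal(0, 1, (N, D))
n = 0  # any valid batch index

# C-contiguity as NumPy sees it at runtime
checks = {
    "mat": mat.flags.c_contiguous,
    "vec": vec.flags.c_contiguous,
    "mat[n,:,:]": mat[n, :, :].flags.c_contiguous,
    "vec[n,:]": vec[n, :].flags.c_contiguous,
    "mat[n]": mat[n].flags.c_contiguous,
}
for name, is_contiguous in checks.items():
    print(name, is_contiguous)

# Both spellings are views onto the same memory
same_memory = np.shares_memory(mat[n], mat[n, :, :])
print("same memory:", same_memory)
```

So at runtime the two spellings describe the same contiguous block; the difference only shows up in the layout that numba infers at compile time.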

Changing my code above to np.dot(vec[n], mat[n]) removed the warning.
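A minimal sketch of the changed function, assuming the function in the question was compiled with nb.njit(parallel=True) (the decorator arguments are my assumption). The name batched_vec_mat and the einsum check are only for illustration, and the try/except fallback is not part of the fix: it just lets the snippet run as plain Python where numba is not installed. Numba's np.dot also needs SciPy available.

```python
import numpy as np

try:
    import numba as nb
    njit, prange = nb.njit, nb.prange
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without numba
    def njit(*args, **kwargs):
        return lambda func: func
    prange = range


@njit(parallel=True)
def batched_vec_mat(mat, vec):
    # result[n] = vec[n] @ mat[n] for every batch index n
    result = np.zeros_like(vec)
    for n in prange(vec.shape[0]):
        # Integer-only indexing: mat[n] instead of mat[n, :, :]
        result[n] = np.dot(vec[n], mat[n])
    return result


D, N = 10, 1000
rng = np.random.default_rng(0)
mat = rng.normal(0, 1, (N, D, D))
vec = rng.normal(0, 1, (N, D))

result = batched_vec_mat(mat, vec)

# Cross-check against a NumPy-only batched product
expected = np.einsum("nd,nde->ne", vec, mat)
print(np.allclose(result, expected))
```

Whether the warning disappears may depend on the numba version; I only observed it with the versions listed in the question.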

I'm making this the "correct" answer since it solved my problem. However, according to max9111's response, this behavior may be a bug!