I encounter a strange warning when performing matrix multiplication after QR decomposition in a Numba-accelerated function. For example:
# Python 3.10
import numpy as np
from numba import jit
@jit
def qr_check(x):
    q, r = np.linalg.qr(x)
    return q @ r
x = np.random.rand(3,3)
qr_check(x)
Running the above code, I get the following NumbaPerformanceWarning:
'@' is faster on contiguous arrays, called on (array(float64, 2d, A), array(float64, 2d, F))
I'm not sure what's going wrong here. I know F is for Fortran, so array r is Fortran-contiguous, but why isn't array q as well?
This comes down to how QR decomposition is implemented in Numba.
As you noted, F stands for Fortran-contiguous (column-major).
A stands for "any": a strided memory layout that Numba cannot assume to be contiguous.
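The layout letters can be related to NumPy's own contiguity flags. The following is a minimal sketch using only NumPy; the helper layout_letter is hypothetical and merely mimics the C/F/A naming, it is not part of Numba's API.

```python
import numpy as np

def layout_letter(arr):
    """Hypothetical helper: map NumPy contiguity flags to a C/F/A letter."""
    if arr.flags.c_contiguous:
        return "C"
    if arr.flags.f_contiguous:
        return "F"
    return "A"

c_arr = np.zeros((3, 4))            # default row-major allocation
f_arr = np.asfortranarray(c_arr)    # column-major copy of the same data
strided = c_arr[:, ::2]             # every other column: not contiguous

print(layout_letter(c_arr), layout_letter(f_arr), layout_letter(strided))
```

Note that inside a jitted function the layout is decided from the types at compile time, so it can be more pessimistic than what these run-time flags report.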
Numba does not call numpy.linalg.qr directly. Let's take a look at Numba's source code:
@overload(np.linalg.qr)
def qr_impl(a):
    ...
As you can see, Numba overloads the function qr.
Inside this overload, Numba calls the LAPACK routine for QR decomposition, which is implemented in Fortran, so the result is Fortran-contiguous. In addition, however, q is sliced:
q[:, :minmn]
So the final layouts are:
A (strided) for Q
F (Fortran-contiguous) for R
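It may help to see that the A label is a compile-time, type-level judgement rather than a statement about the actual memory. The NumPy-only sketch below uses illustrative names and shapes (q_full is an assumed stand-in for the Fortran-ordered buffer that LAPACK fills, and minmn is set by hand). It shows that taking the leading columns of a column-major array still yields contiguous data at run time, even though Numba types the sliced q as A:

```python
import numpy as np

m, n = 4, 3
minmn = min(m, n)

# Assumed stand-in for the Fortran-ordered work buffer.
q_full = np.asfortranarray(np.random.rand(m, m))

# Same slicing pattern as in the snippet above.
q = q_full[:, :minmn]

# The leading columns of a column-major array are stored back to back.
print(q.shape, q.flags.f_contiguous)
```

So the warning appears to reflect what Numba can prove from the types, not necessarily a slow memory layout in this particular call.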
You will get the same warning in a similar case with a matrix product:
@jit
def qr_check(x):
    q = np.zeros((100, 64))
    r = np.zeros((64, 200))
    return q @ r[:1000, :1000]