Tags: python, arrays, numpy, numba, contiguous

NumPy arrays from Numba-accelerated QR decomposition are not contiguous


I encounter a strange warning when performing matrix multiplication after QR decomposition in a Numba-accelerated function. For example:

# Python 3.10

import numpy as np
from numba import jit

@jit
def qr_check(x):
    q, r = np.linalg.qr(x)
    return q @ r

x = np.random.rand(3, 3)
qr_check(x)

Running the above code, I get the following NumbaPerformanceWarning:

'@' is faster on contiguous arrays, called on (array(float64, 2d, A), array(float64, 2d, F))

I'm not sure what's going wrong here. I know F is for Fortran, so array r is Fortran-contiguous, but why isn't array q as well?


Solution

  • This comes down to how QR decomposition is implemented in Numba.

    As you noted, F stands for Fortran-contiguous (column-major).

    A stands for a strided ("any") layout: Numba cannot guarantee at compile time that the array is contiguous in either order.
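
    Outside of jitted code, numba.typeof shows which layout code Numba assigns to a given array. A quick illustration (the reprs in the comments are what recent Numba versions print and may differ slightly):

    import numpy as np
    from numba import typeof

    a = np.zeros((4, 4))                  # C-contiguous (row-major)
    print(typeof(a))                      # array(float64, 2d, C)
    print(typeof(np.asfortranarray(a)))   # array(float64, 2d, F)
    print(typeof(a[:, :2]))               # array(float64, 2d, A) -- a strided, non-contiguous view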

    Numba does not call numpy.linalg.qr directly. Let's take a look at Numba's source code:

    https://github.com/numba/numba/blob/251061051ea13c8618c5fbd5e6b3f90a3315fec9/numba/np/linalg.py#L1418

    @overload(np.linalg.qr)
    def qr_impl(a):
        ...
    

    As you can see, Numba overloads the qr function. Inside this overload it calls the LAPACK routine for the QR decomposition, which is implemented in Fortran, so the raw result is Fortran-contiguous. However, before being returned, q is additionally sliced:

    q[:, :minmn]
    

    https://github.com/numba/numba/blob/251061051ea13c8618c5fbd5e6b3f90a3315fec9/numba/np/linalg.py#L1490
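
    The slice bound minmn is only known at run time, so Numba's type inference cannot prove that the sliced view is still contiguous and falls back to layout A. You can see the same effect with any basic slice; a minimal sketch (take_cols is just an illustrative name, and the printed signature format may vary between Numba versions):

    import numpy as np
    from numba import njit

    @njit
    def take_cols(a, k):
        # a basic slice: Numba types the result with layout 'A',
        # since contiguity cannot be guaranteed at compile time
        return a[:, :k]

    take_cols(np.ones((4, 4)), 2)
    # the compiled signature should end in something like
    # -> array(float64, 2d, A)
    print(take_cols.nopython_signatures)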

    So the final layouts are:

    q: A (strided)

    r: F (Fortran-contiguous)
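
    You can confirm those layouts by returning the factors directly and inspecting the compiled signature (a sketch; the exact repr of the printed types may vary between Numba versions):

    import numpy as np
    from numba import njit

    @njit
    def qr_parts(x):
        # return q and r themselves so their inferred layouts
        # show up in the compiled signature
        return np.linalg.qr(x)

    qr_parts(np.random.rand(3, 3))
    # expect a return type that reads roughly
    # Tuple(array(float64, 2d, A), array(float64, 2d, F))
    print(qr_parts.nopython_signatures)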

    You will get the same kind of warning in any similar case where an operand of a matrix product comes from slicing:

    @jit
    def qr_check(x):
        q = np.zeros((100, 64))   # C-contiguous
        r = np.zeros((64, 200))   # C-contiguous
        # the slice covers all of r (NumPy clamps the bounds), but Numba
        # still types it with layout 'A', so the warning fires again
        return q @ r[:1000, :1000]
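
    If the warning matters for performance, one common workaround (a sketch, not the only option) is to copy the strided factor into a contiguous array before the product, e.g. with np.ascontiguousarray, which Numba supports in nopython mode:

    import numpy as np
    from numba import jit

    @jit
    def qr_matmul_contig(x):
        q, r = np.linalg.qr(x)
        # copying q into a C-contiguous array removes the A-layout operand,
        # which should silence the warning at the cost of one extra copy
        return np.ascontiguousarray(q) @ r

    Whether that extra copy actually pays off depends on the matrix sizes, so it is worth benchmarking both versions.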