numpyscipyleast-squares

Numpy/Scipy : solving several least squares with the same design matrix


I face a least square problem that i solve via scipy.linalg.lstsq(M,b), where :

The issue is that i have to solve it a bunch of time for different b's. How can i do something more efficient ? I guess that lstsq does a lot of things independently of the value of b.

Ideas ?


Solution

  • In the case your linear system is well-determined, I'll store M LU decomposition and use it for all the b's individually or simply do one solve call for 2d-array B representing the horizontally stacked b's, it really depends on your problem here but this is globally the same idea. Let's suppose you've got each b one at a time, then:

    import numpy as np
    from scipy.linalg import lstsq, lu_factor, lu_solve, svd, pinv
    
    # as you didn't specified any practical dimensions
    n = 100
    # number of b's
    nb_b = 10
    
    # generate random n-square matrix M
    M = np.random.rand(n**2).reshape(n,n)
    
    # Set of nb_b of right hand side vector b as columns
    B = np.random.rand(n*nb_b).reshape(n,nb_b)
    
    # compute pivoted LU decomposition of M
    M_LU = lu_factor(M)
    # then solve for each b
    X_LU = np.asarray([lu_solve(M_LU,B[:,i]) for i in range(nb_b)])
    

    but if it is under or over-determined, you need to use lstsq as you did:

    X_lstsq = np.asarray([lstsq(M,B[:,i])[0] for i in range(nb_b)])
    

    or simply store the pseudo-inverse M_pinv with pinv (built on lstsq) or pinv2 (built on SVD):

    # compute the pseudo-inverse of M
    M_pinv = pinv(M)
    X_pinv = np.asarray([np.dot(M_pinv,B[:,i]) for i in range(nb_b)])
    

    or you can also do the work by yourself, as in pinv2 for instance, just store the SVD of M, and solve this manually:

    # compute svd of M
    U,s,Vh = svd(M)
    
    def solve_svd(U,s,Vh,b):
        # U diag(s) Vh x = b <=> diag(s) Vh x = U.T b = c
        c = np.dot(U.T,b)
        # diag(s) Vh x = c <=> Vh x = diag(1/s) c = w (trivial inversion of a diagonal matrix)
        w = np.dot(np.diag(1/s),c)
        # Vh x = w <=> x = Vh.H w (where .H stands for hermitian = conjugate transpose)
        x = np.dot(Vh.conj().T,w)
        return x
    
    X_svd = np.asarray([solve_svd(U,s,Vh,B[:,i]) for i in range(nb_b)])
    

    which all give the same result if checked with np.allclose (unless the system is not well-determined resulting in the LU direct approach failure). Finally in terms of performances:

    %timeit M_LU = lu_factor(M); X_LU = np.asarray([lu_solve(M_LU,B[:,i]) for i in range(nb_b)])
    1000 loops, best of 3: 1.01 ms per loop
    
    %timeit X_lstsq = np.asarray([lstsq(M,B[:,i])[0] for i in range(nb_b)])
    10 loops, best of 3: 47.8 ms per loop
    
    %timeit M_pinv = pinv(M); X_pinv = np.asarray([np.dot(M_pinv,B[:,i]) for i in range(nb_b)])
    100 loops, best of 3: 8.64 ms per loop
    
    %timeit U,s,Vh = svd(M); X_svd = np.asarray([solve_svd(U,s,Vh,B[:,i]) for i in range(nb_b)])
    100 loops, best of 3: 5.68 ms per loop
    

    Nevertheless, it's up to you to check these with appropriate dimensions.

    Hope this helps.