matlab, least-squares, svd, matrix-decomposition, qr-decomposition

My example shows SVD is less numerically stable than QR decomposition


I asked this question on Math Stack Exchange, but it seems it didn't get enough attention there, so I am asking it here: https://math.stackexchange.com/questions/1729946/why-do-we-say-svd-can-handle-singular-matrx-when-doing-least-square-comparison?noredirect=1#comment3530971_1729946

I learned from some tutorials that SVD should be more stable than QR decomposition when solving a least-squares problem, and that it is able to handle singular matrices. But the following example, which I wrote in MATLAB, seems to support the opposite conclusion. I don't have a deep understanding of SVD, so if you could look at my questions in the old post on Math Stack Exchange and explain them to me, I would appreciate it a lot.

I use a matrix with a large condition number (around 1e+13). The result shows that SVD gets a much larger error (0.8) than QR (around 1e-27).

% we do a linear regression between Y and X
data= [
47.667483331 -122.1070832;
47.667483331001 -122.1070832
];
X = data(:,1);
Y = data(:,2);

X_1 =  [ones(length(X),1),X];

%%
%SVD method
[U,D,V] = svd(X_1,'econ');
beta_svd = V*diag(1./diag(D))*U'*Y;


%% QR method (one could also use the "\" operator here, which gives the same result, as I tested; I just wrote out the backward substitution to educate myself)
[Q,R] = qr(X_1);
% now do backward substitution on the upper-triangular system R*beta = Q'*Y
[nr, nc] = size(R);
beta_qr = zeros(nc,1);
Y_1 = Q'*Y;
for i = nc:-1:1
    s = Y_1(i);
    for j = nc:-1:i+1
        s = s - R(i,j)*beta_qr(j);   % subtract the already-solved components
    end
    beta_qr(i) = s/R(i,i);
end

svd_error = 0;
qr_error = 0;
for i=1:length(X)
   svd_error = svd_error + (Y(i) - beta_svd(1) - beta_svd(2) * X(i))^2;
   qr_error = qr_error + (Y(i) - beta_qr(1) - beta_qr(2) * X(i))^2;
end

Solution

  • Your SVD-based approach is basically the same as the pinv function in MATLAB (see Pseudo-inverse and SVD). What you are missing, though (for numerical reasons), is a tolerance value such that any singular values less than this tolerance are treated as zero.

    If you refer to edit pinv.m, you can see something like the following (I won't post the exact code here because the file is copyrighted to MathWorks):

    [U,S,V] = svd(A,'econ');
    s = diag(S);
    tol = max(size(A)) * eps(norm(s,inf));
    % .. use above tolerance to truncate singular values
    invS = diag(1./s);
    out = V*invS*U';
    

    In fact pinv has a second syntax where you can explicitly specify the tolerance value pinv(A,tol) if the default one is not suitable...
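
    Here is a rough sketch of how that truncation step could look (my own version, not the copyrighted MathWorks code; out and x are just illustrative names, for a generic A and right-hand side b):

    [U,S,V] = svd(A,'econ');
    s = diag(S);
    tol = max(size(A)) * eps(norm(s,inf));
    r = sum(s > tol);                              % number of singular values kept
    out = V(:,1:r) * diag(1./s(1:r)) * U(:,1:r)';  % truncated pseudo-inverse
    x = out * b;                                   % least-squares solution with the tiny singular values ignored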


    So when solving a least-squares problem of the form minimize norm(A*x-b), you should understand that the pinv and mldivide solutions have different properties:

    - mldivide returns a basic solution (here, one of its components is exactly zero);
    - pinv returns the minimum-norm least-squares solution.

    Using your example (note that rcond(A) is very small, close to machine epsilon):

    data = [
        47.667483331    -122.1070832;
        47.667483331001 -122.1070832
    ];
    A = [ones(size(data,1),1), data(:,1)];
    b = data(:,2);
    

    Let's compare the two solutions:

    x1 = A\b;
    x2 = pinv(A)*b;
    

    First you can see how mldivide returns a solution x1 with one zero component (this is still a valid least-squares solution: with the slope component set to zero, both equations are satisfied by the intercept alone, since b is the same in both rows):

    >> sol = [x1 x2]
    sol =
     -122.1071   -0.0537
             0   -2.5605
    

    Next you see how pinv returns a solution x2 with a smaller norm:

    >> nrm = [norm(x1) norm(x2)]
    nrm =
      122.1071    2.5611
    

    Here is the residual error of both solutions, which is acceptably small:

    >> err = [norm(A*x1-b) norm(A*x2-b)]
    err =
       1.0e-11 *
             0    0.1819
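
    Putting those two observations together, here is a small check of my own (not part of the original comparison above): nudging x2 along the right singular vector that belongs to the negligible singular value barely changes the residual, but it does increase the norm of the solution, which is what "minimum-norm solution" means here.

    [~,~,V] = svd(A);
    v = V(:,end);                              % direction of the (numerically) negligible singular value
    [norm(A*x2-b), norm(A*(x2 + 5*v)-b)]       % residuals stay tiny along this direction
    [norm(x2), norm(x2 + 5*v)]                 % ...but the solution norm grows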
    

    Note that using mldivide, linsolve, or qr will give pretty much the same result:

    >> x3 = linsolve(A,b)
    Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND =  2.159326e-16. 
    x3 =
     -122.1071
             0
    
    >> [Q,R] = qr(A); x4 = R\(Q'*b)
    x4 =
     -122.1071
             0
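
    Finally, coming back to the SVD code from the question, the same tolerance trick fixes it. This is just a sketch using the question's X_1 and Y (beta_svd_fixed is an illustrative name): with the reciprocal of the negligible singular value zeroed out, the result agrees with pinv(X_1)*Y, and the residual drops to the same tiny level as shown above instead of the 0.8 you measured.

    [U,D,V] = svd(X_1,'econ');
    d = diag(D);
    tol = max(size(X_1)) * eps(norm(d,inf));
    dinv = zeros(size(d));
    dinv(d > tol) = 1 ./ d(d > tol);           % keep reciprocals only for significant singular values
    beta_svd_fixed = V * diag(dinv) * U' * Y;  % now agrees with pinv(X_1)*Y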