Tags: vector, pytorch, svd, orthogonal, numerical-stability

PyTorch torch.linalg.svd returning U and V^T that are not orthogonal


I am using U, S, VT = torch.linalg.svd(M) on a large matrix M, and the matrices U and VT I get back are not orthogonal. When I compute torch.norm(torch.mm(matrix, matrix.t()) - identity_matrix) the result is 0.004, and when I print M @ M^T the diagonal entries are not 1 (rather 0.2 or 0.4) and the off-diagonal entries are not 0 (but around 0.0023). Is there a way to get an SVD with orthogonal U and V^T? The singular values, i.e. the diagonal elements of S, are near 1 though.

import torch

matrix = torch.randn(4096, 4096)
u, s, vh = torch.linalg.svd(matrix)
# replace the random matrix by u @ vh, which should be orthogonal
matrix = torch.mm(u, vh)
print('norm ||WTW - I||: ', torch.norm(torch.mm(matrix, matrix.t()) - torch.eye(matrix.shape[0])))
print(matrix)

I have done some numerical analysis, and it seems PyTorch's torch.linalg.svd is not returning orthogonal u and vh. Can others verify whether they see this behaviour too, or am I doing something wrong?

MATLAB: I tried the built-in svd decomposition in MATLAB, and there norm(u*transpose(u) - eye(4096)) is about 1e-13.


Solution

  • Why do you expect matrix @ matrix.T to be close to I?

    SVD is a decomposition of the input matrix matrix. It does not alter it; it only produces three factors u, s and vh such that matrix = u @ diag(s) @ vh (note that torch.linalg.svd returns s as a 1-D vector of singular values, not as a diagonal matrix). The special thing about SVD is that these factors are not arbitrary but essentially unique (up to signs and the ordering of equal singular values): u and vh are orthogonal, and s holds the non-negative singular values in descending order.

    What you should actually expect is:

    import torch

    matrix = torch.randn(4096, 4096)
    u, s, vh = torch.linalg.svd(matrix)

    # both factors should be orthogonal, up to floating-point error
    print(f'||u uT - I|| = {torch.norm(u @ u.t() - torch.eye(u.shape[0]))}')
    print(f'||vhT vh - I|| = {torch.norm(vh.t() @ vh - torch.eye(vh.shape[0]))}')
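    # Sanity check (my addition, assuming only torch.linalg.svd's documented
    # return convention): rebuild the diagonal from the 1-D vector s and
    # verify that the three factors reconstruct the input.
    print(f'||u diag(s) vh - M|| = {torch.norm(u @ torch.diag(s) @ vh - matrix)}')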
    

    Note that due to numeric issues the difference ||u uT - I|| is unlikely to be exactly zero; it will be some small number that depends on the dimensions of your matrix (the larger the matrix, the greater the error) and on the precision of the dtype you used: float32 (aka single) will likely result in a larger error than float64 (aka double).
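    As a minimal sketch of that precision point (this float64 variant is an addition, not part of the original check), you can repeat the test in double precision and compare:

    matrix64 = torch.randn(4096, 4096, dtype=torch.float64)
    u64, s64, vh64 = torch.linalg.svd(matrix64)
    # in float64 the error is typically around 1e-12 at this size,
    # comparable to the MATLAB result, since MATLAB computes in
    # double precision by default
    print(f'||u64 u64T - I|| = {torch.norm(u64 @ u64.t() - torch.eye(4096, dtype=torch.float64))}')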


    PS, the operator @ stands for matrix multiplication.
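    For example, for 2-D tensors a and b, a @ b computes the same product as torch.mm(a, b); a tiny illustrative check (the names here are just for illustration):

    a = torch.randn(3, 3)
    b = torch.randn(3, 3)
    assert torch.allclose(a @ b, torch.mm(a, b))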