python pca scikits

How to whiten a matrix in PCA


I'm working with Python and I've implemented the PCA using this tutorial.

Everything works great: I got the covariance, did a successful transform, and brought it back to the original dimensions with no problem.

But how do I perform whitening? I tried dividing the eigenvectors by the eigenvalues:

S, V = numpy.linalg.eig(cov)
V = V / S[:, numpy.newaxis]

and used V to transform the data, but this led to weird data values. Could someone please shed some light on this?


Solution

  • Here's a numpy implementation of some Matlab code for matrix whitening I got from here.

    import numpy as np
    
    def whiten(X,fudge=1E-18):
    
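       # the matrix X should be observations-by-components, with each
       # column already mean-centred
    
       # get the (unnormalised) covariance matrix; it is not divided by the
       # number of observations, so X_white.T @ X_white comes out as identity
       Xcov = np.dot(X.T,X)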
    
       # eigenvalue decomposition of the covariance matrix
       d, V = np.linalg.eigh(Xcov)
    
       # a fudge factor can be used so that eigenvectors associated with
       # small eigenvalues do not get overamplified.
       D = np.diag(1. / np.sqrt(d+fudge))
    
       # whitening matrix
       W = np.dot(np.dot(V, D), V.T)
    
       # multiply by the whitening matrix
       X_white = np.dot(X, W)
    
       return X_white, W
    

    You can also whiten a matrix using SVD:

    def svd_whiten(X):
    
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
    
        # U and Vt are the singular matrices, and s contains the singular values.
        # Since the columns of U and the rows of Vt are orthonormal vectors,
        # the matrix product U @ Vt will be white
        X_white = np.dot(U, Vt)
    
        return X_white
    

    The second way is a bit slower, but probably more numerically stable.
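
As for the attempt in the question: numpy.linalg.eig returns the eigenvectors as the columns of V, so dividing by S[:, numpy.newaxis] rescales the rows instead, and whitening needs the square root of each eigenvalue rather than the eigenvalue itself. Below is a minimal sketch of PCA whitening under those two corrections. The function name pca_whiten, the eps parameter and the test data are hypothetical, chosen only for illustration, and it assumes observations are in rows.

```python
import numpy as np

def pca_whiten(X, eps=1e-12):
    """PCA-whiten X (observations in rows, features in columns).

    Hypothetical helper for illustration; eps is an assumed small
    constant that guards against division by a near-zero eigenvalue.
    """
    # Centre each feature; whitening assumes zero-mean data.
    Xc = X - X.mean(axis=0)

    # Sample covariance matrix (features x features).
    cov = np.cov(Xc, rowvar=False)

    # eigh is meant for symmetric matrices; the eigenvectors are the
    # columns of V, and d[i] belongs to V[:, i].
    d, V = np.linalg.eigh(cov)

    # Scale column i of V by 1 / sqrt(d[i]). Broadcasting a 1-D array
    # against V applies it across columns, which is what is needed here.
    W = V / np.sqrt(d + eps)

    return Xc @ W, W

# Correlated toy data: independent noise mixed by a fixed matrix.
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
X = rng.normal(size=(500, 3)) @ A

X_white, W = pca_whiten(X)

# The covariance of the whitened data should be close to the identity.
cov_white = np.cov(X_white, rowvar=False)
print(np.round(cov_white, 6))
```

This projects onto the principal axes and rescales them (PCA whitening). The whiten function above additionally rotates back with V.T, which is usually called ZCA whitening. Both give decorrelated data with unit variance.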