rrasterspatialpca

running a PCA on a RasterStack in R


I am attempting to run a PCA on a set of the variables available in Bioclim. I noted that rasterPCA() is no longer available in R as its package has been discontinued, and it looks like that is because some issues with that package were not resolved. Below, I attempted to run a PCA on the RasterStack directly (this is necessary in this case, because if I convert to a data frame I will lose the spatial information about each row, and my goal is to create a new rasterstack with all of the PCs that this PCA will generate).


#Libraries:

library(geodata)
library(raster)

#Downloading the data:

bioclim_all <- worldclim_global(var = "bio", 
                                res = 0.5, 
                                path = "/data")

#Creating a bounding box:

bounding_box <- extent(x = c(-118.2724, -86.4236, 14.3237, 32.4306))

#Cropping to a smaller resolution:

crop_bioclim <- crop(x = bioclim_all, y = bounding_box)

#Conduct a PCA with standardization:

pca <- prcomp(crop_bioclim, center = TRUE, scale = TRUE)


However, I get the following error:

Error in svd(x, nu = 0, nv = k) : infinite or missing values in 'x'

Does anyone know how to make this direct PCA analysis work, in a similar vein to how rasterPCA used to work? Or a way of preserving spatial data when running the PCA?


Solution

  • I am using "terra", the replacement of "raster". You can directly use prcomp or princomp with a SpatRaster.

    Example data (always include some when asking an R question)

    library(terra)
    logo <- rast(system.file("ex/logo.tif", package="terra"))   
    

    Solution:

    pca <- prcomp(logo)
    
    pca
    #Standard deviations (1, .., p=3):
    #[1] 124.814772  17.084151   1.456423
    #
    #Rotation (n x k) = (3 x 3):
    #             PC1        PC2        PC3
    #red   -0.5935507  0.5073220 -0.6247575
    #green -0.5846008  0.2617406  0.7679412
    #blue  -0.5531179 -0.8210458 -0.1412245
    

    Now you can use predict to map the principal components:

    x <- predict(logo, pca)
    plot(x)
    

    enter image description here

    If you have a very large raster you can use princomp instead or take a sample of the cells. For example

    pca <- prcomp(logo, maxcell=100000)
    

    To make a SpatRaster from your RasterStack s you can do r <- rast(s) and to go the other way you can do ss <- stack(r)