rggplot2pcaggfortify

Reproducing loadings from autoplot in ggfortify for PCA


I'm using autoplot in ggfortify to plot a prcomp object. I also wanted to see if I could manually reproduce the plot. However, when I do this, the two plots seems to differ in terms of how the loadings are plotted. Note that the dataset is not scaled here. The dataset that I'm actually doing this exercise on has the same units for all columns, and I figure not scaling may be consequential.

library(ggplot2)
library(ggfortify)
data(mtcars)
cars_pca = prcomp(mtcars, center = TRUE, scale. = FALSE)
autoplot(cars_pca, loadings = TRUE, loadings.label = TRUE)

plot(x = cars_pca$x[, 1], y = cars_pca$x[, 2])
text(x = cars_pca$rotation[, 1], 
     y = cars_pca$rotation[, 2],
     col = alpha('red', 0.7),
     cex = 1,
     labels = colnames(mtcars))

I've read that R uses the terms loading and eigenvector interchangeably even when that's not precise. However, I'm unable to produce the ggfortify biplot even when implementing manually what's detailed in these two questions. I also tried looking for the source code for ggfortify::autoplot.prcomp but no luck.

Can someone give me some advice for how to reproduce the loadings that ggfortify is plotting? I'd really like to understand what's going on.

Thanks!


Solution

  • The transformation is happening in the ggfortify::autoplot.prcomp function. Since you are using the default value of scale=1 then the values are adjusted. Here's how to recover the same points plotted during the autoplot. The loadings are transformed in the ggfortify::ggbiplot function

    scaler <- cars_pca$sdev[c(1,2)] * sqrt(nrow(cars_pca$x))
    trans <- t(t(cars_pca$x[, 1:2]) / scaler)
    scaler2 <- min(max(abs(trans[, 1L])) / max(abs(cars_pca$rotation[, 1L])),
                  max(abs(trans[, 2L])) / max(abs(cars_pca$rotation[, 2L])))
    trans2 <- cars_pca$rotation[, 1L:2L] * scaler2 * 0.8
    
    plot(x = trans[, 1], y = trans[, 2])
    text(x = trans2[, 1], 
         y = trans2[, 2],
         col = alpha('red', 0.7),
         cex = 1,
         labels = colnames(mtcars))
    

    enter image description here