Tags: r, ggplot2, pca, iris-dataset, biplot

Using different scales for secondary axis for ggplot function in R


I am trying to create a biplot from the iris data set using the ggplot2 package. I used the code below to generate it:

library(ggplot2)
library(devtools)


# Load iris dataset
data(iris)


# Run PCA and extract scores and loadings
iris_pca <- prcomp(iris[-5], scale. = TRUE)

scores <- as.data.frame(iris_pca$x) 
scores$Species <- iris$Species

loadings <- iris_pca$rotation

# Create biplot
biplot <- ggplot(data = scores, aes(x = PC1, y = PC2)) +
          # Scores on primary scales
          geom_point(aes(color = Species)) +
          # Loadings on secondary scales
          geom_segment(aes(x = 0, y = 0, xend = loadings[1,1], yend = loadings[1,2]), 
                       arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)) +
          geom_segment(aes(x = 0, y = 0, xend = loadings[2,1], yend = loadings[2,2]), 
                       arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)) +
          geom_segment(aes(x = 0, y = 0, xend = loadings[3,1], yend = loadings[3,2]), 
                       arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)) +
          geom_segment(aes(x = 0, y = 0, xend = loadings[4,1], yend = loadings[4,2]), 
                       arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)) +
          # Primary scales
          scale_x_continuous(limits = c(-3, 3), name = "PC1") +
          scale_y_continuous(limits = c(-3, 3), name = "PC2") +
          # Secondary scales
          scale_x_continuous(sec.axis = sec_axis(~ . / 1.2, name = "Loadings on PC1")) +
          scale_y_continuous(sec.axis = sec_axis(~ . / 1.2, name = "Loadings on PC2")) +
          # Theme
          theme_bw()

biplot

The above code results in a biplot as shown below:

[Image: biplot]

How can I give the secondary axes a different scale (limits = c(-0.8, 0.8)) so that only the arrows are zoomed in, while the primary scale and the scores (points) stay unchanged? Is there a way to achieve this? Thank you for your help.

Regards, Farhan


Solution

  • A secondary scale can't be specified independently of the primary scale: it is always derived from the primary scale through the transformation passed to sec_axis(). Consequently, the primary and the secondary axis have to be specified in a single scale_x_continuous() or scale_y_continuous() call; in your code the second call for each axis replaces the first, which drops the limits you set. Moreover, the transformation in sec_axis() only affects the breaks, limits and labels of the secondary axis. It does not touch the data. You have to rescale the data yourself with the inverse of the transformation used on the axis: here the axis divides by scale, so the loadings are multiplied by scale. Finally, I simplified your code a bit by using a single geom_segment() to draw all the arrows.

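    library(ggplot2)
    
    # Reuses iris_pca and scores from the question's code
    loadings <- as.data.frame(iris_pca$rotation)
    # Keep the variable names as a column (not used in the plot below)
    loadings$Variable <- rownames(loadings)
    
    # Factor by which the loadings are stretched on the primary scale
    scale <- 2
    # Create biplot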
    ggplot(data = scores, aes(x = PC1, y = PC2)) +
      geom_point(aes(color = Species)) +
      geom_segment(
        data = loadings, aes(
          x = 0, y = 0,
          xend = PC1 * scale, yend = PC2 * scale
        ),
        arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)
      ) +
      # Primary scales
      scale_x_continuous(
        limits = c(-3, 3), name = "PC1",
        sec.axis = sec_axis(~ . / scale, name = "Loadings on PC1")
      ) +
      scale_y_continuous(
        limits = c(-3, 3), name = "PC2",
        sec.axis = sec_axis(~ . / scale, name = "Loadings on PC2")
      ) +
      # Theme
      theme_bw()
    #> Warning: Removed 1 rows containing missing values (`geom_point()`).
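
If you want the secondary axes to cover a specific range rather than picking the factor by eye, the factor can be derived from the two limits. The following is only a sketch under some assumptions: the names primary_limit, secondary_limit, load_scale and biplot_scaled are made up for this example, and the secondary limit is set to 1 instead of the 0.8 from the question. As far as I can tell, the largest loading on the first two components (Sepal.Width on PC2) is larger than 0.8 in absolute value, so with 0.8 that arrow would fall outside the primary limits and be dropped. Since loadings are components of unit-length vectors, 1 is always a safe bound.

```r
library(ggplot2)

# Recompute the PCA so that the example is self-contained
iris_pca <- prcomp(iris[-5], scale. = TRUE)
scores <- as.data.frame(iris_pca$x)
scores$Species <- iris$Species
loadings <- as.data.frame(iris_pca$rotation)

# Hypothetical names: half-width of the primary axes and of the secondary axes
primary_limit <- 3
secondary_limit <- 1

# Guard: every arrow must fit inside the chosen secondary range
max_loading <- max(abs(as.matrix(loadings[, c("PC1", "PC2")])))
stopifnot(max_loading <= secondary_limit)

# Factor that maps [-secondary_limit, secondary_limit] onto the primary range
load_scale <- primary_limit / secondary_limit

biplot_scaled <- ggplot(data = scores, aes(x = PC1, y = PC2)) +
  geom_point(aes(color = Species)) +
  # Arrows are drawn on the primary scale, so stretch them by load_scale
  geom_segment(
    data = loadings,
    aes(x = 0, y = 0, xend = PC1 * load_scale, yend = PC2 * load_scale),
    arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)
  ) +
  # The secondary axes undo the stretch, so they read in loading units
  scale_x_continuous(
    limits = c(-primary_limit, primary_limit), name = "PC1",
    sec.axis = sec_axis(~ . / load_scale, name = "Loadings on PC1")
  ) +
  scale_y_continuous(
    limits = c(-primary_limit, primary_limit), name = "PC2",
    sec.axis = sec_axis(~ . / load_scale, name = "Loadings on PC2")
  ) +
  theme_bw()

biplot_scaled
```

The points are untouched; only the arrow end points and the secondary axis labels change with load_scale. A smaller secondary_limit zooms further into the arrows, as long as the guard above still passes.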