I am trying to create a biplot from the iris data set using the ggplot2 package. I used the code below to generate the biplot:
library(ggplot2)
library(devtools)
# Load iris dataset
data(iris)
# Run PCA and extract scores and loadings
iris_pca <- prcomp(iris[-5], scale. = TRUE)
scores <- as.data.frame(iris_pca$x)
scores$Species <- iris$Species
loadings <- iris_pca$rotation
# Create biplot
biplot <- ggplot(data = scores, aes(x = PC1, y = PC2)) +
# Scores on primary scales
geom_point(aes(color = Species)) +
# Loadings on secondary scales
geom_segment(aes(x = 0, y = 0, xend = loadings[1,1], yend = loadings[1,2]),
arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)) +
geom_segment(aes(x = 0, y = 0, xend = loadings[2,1], yend = loadings[2,2]),
arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)) +
geom_segment(aes(x = 0, y = 0, xend = loadings[3,1], yend = loadings[3,2]),
arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)) +
geom_segment(aes(x = 0, y = 0, xend = loadings[4,1], yend = loadings[4,2]),
arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)) +
# Primary scales
scale_x_continuous(limits = c(-3, 3), name = "PC1") +
scale_y_continuous(limits = c(-3, 3), name = "PC2") +
# Secondary scales
scale_x_continuous(sec.axis = sec_axis(~ . / 1.2, name = "Loadings on PC1")) +
scale_y_continuous(sec.axis = sec_axis(~ . / 1.2, name = "Loadings on PC2")) +
# Theme
theme_bw()
biplot
The above code results in a biplot as shown below:
How can I use a different secondary axis scale (limits = c(-0.8, 0.8)) that only zooms in on the arrows and does not affect the primary scale (nor the scores or points)? Is there any way to achieve this? I would be thankful for your help.
Regards, Farhan
A secondary scale can't be specified independently of the primary scale: it is always derived from the primary scale through the transformation passed to sec_axis(). Consequently, the primary and the secondary axis have to be specified in a single scale_x_continuous() or scale_y_continuous() call. A second call for the same axis replaces the first one instead of adding to it. Moreover, the transformation specified via sec_axis() only affects the breaks, the limits and the labels of the secondary axis. It does not touch the data. Instead, you have to take care of that yourself by transforming the data with the inverse of the transformation applied on the scale: if the secondary axis divides by a factor, the loadings have to be multiplied by that factor. Finally, I simplified your code a bit by using just one geom_segment() to add the arrows.
library(ggplot2)
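# Loadings as a data frame (iris_pca and scores come from the question's code)
loadings <- as.data.frame(iris_pca$rotation)
# Row names are the measured variables, not species
loadings$Variable <- rownames(loadings)
# Factor by which the arrows are stretched to sit on the primary scale
scale <- 2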
# Create biplot
ggplot(data = scores, aes(x = PC1, y = PC2)) +
geom_point(aes(color = Species)) +
geom_segment(
data = loadings, aes(
x = 0, y = 0,
xend = PC1 * scale, yend = PC2 * scale
),
arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)
) +
# Primary scales
scale_x_continuous(
limits = c(-3, 3), name = "PC1",
sec.axis = sec_axis(~ . / scale, name = "Loadings on PC1")
) +
scale_y_continuous(
limits = c(-3, 3), name = "PC2",
sec.axis = sec_axis(~ . / scale, name = "Loadings on PC2")
) +
# Theme
theme_bw()
#> Warning: Removed 1 rows containing missing values (`geom_point()`).
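To get close to the secondary limits asked for in the question, the factor can be derived from the two ranges instead of being hard-coded. The following is only a sketch under a few assumptions: the names primary_limit, wanted_limit, max_loading, secondary_limit and arrow_scale are made up for illustration, and the 5% margin is an arbitrary choice. Note that an arrow whose end point falls outside the primary limits is dropped by ggplot2, so the sketch widens the secondary range whenever a loading would not fit into c(-0.8, 0.8).

```r
library(ggplot2)

# Recompute the PCA so this sketch runs on its own
iris_pca <- prcomp(iris[-5], scale. = TRUE)
scores <- as.data.frame(iris_pca$x)
scores$Species <- iris$Species
loadings <- as.data.frame(iris_pca$rotation)

primary_limit <- 3    # primary axes span [-3, 3], as in the code above
wanted_limit <- 0.8   # secondary range [-0.8, 0.8] requested in the question

# Largest absolute loading on the two plotted components
max_loading <- max(abs(as.matrix(loadings[, c("PC1", "PC2")])))
# Widen the secondary range (assumed 5% margin) if an arrow would not fit
secondary_limit <- max(wanted_limit, 1.05 * max_loading)
# Factor that maps loading units onto the primary scale
arrow_scale <- primary_limit / secondary_limit

p <- ggplot(scores, aes(x = PC1, y = PC2)) +
  geom_point(aes(color = Species)) +
  # Loadings are multiplied by arrow_scale (the inverse of the axis transformation)
  geom_segment(
    data = loadings,
    aes(x = 0, y = 0, xend = PC1 * arrow_scale, yend = PC2 * arrow_scale),
    arrow = arrow(length = unit(0.3, "cm"), type = "closed", angle = 25)
  ) +
  # The secondary axes divide by the same factor, so they read in loading units
  scale_x_continuous(
    limits = c(-primary_limit, primary_limit), name = "PC1",
    sec.axis = sec_axis(~ . / arrow_scale, name = "Loadings on PC1")
  ) +
  scale_y_continuous(
    limits = c(-primary_limit, primary_limit), name = "PC2",
    sec.axis = sec_axis(~ . / arrow_scale, name = "Loadings on PC2")
  ) +
  theme_bw()
p
```

With this approach the points stay where they are on the primary scale. Only the arrow length and the labels of the secondary axes change with arrow_scale.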