I am trying to make a PCA biplot from the data below using the following code. The data contain genotypes belonging to 4 species, and 4 variables (SPAD, PN, Y(II), DMC) were evaluated under 2 conditions (1 = control, 2 = stress). I successfully made the PCA biplot, as shown in the picture, but I am having trouble with the variable labels shown alongside the arrows. I want to convert the 1 and 2 suffixes into superscript CT and HS, and I want to keep the original variable names, for example "Y(II)", which is always converted to "Y.II." in the graph. I also want the individual text labels to use the same colors as the Parameters legend, while keeping the Species legend colors to differentiate the species.
library(FactoMineR) # provides PCA()
library(factoextra) # provides fviz_pca_biplot()
pca <- PCA(data.frame(data[,-2], row.names = 1), ncp=7, graph = TRUE, scale.unit = TRUE)
SPAD <- "Chlorophyll Index"
IRGA <- "Gas exchange"
CF <- "Chlorophyll fluorescence"
ag <- "Morphological traits"
traits <- factor(c(SPAD,IRGA,CF,ag,SPAD, IRGA,CF,ag))
fviz_pca_biplot(pca,
geom.ind = c("point","text"),
pointshape = 21,
pointsize = 2.5,
fill.ind = data$Species,
col.ind = "black",
col.var = traits,
legend.title = list(fill = "Species", color = "Parameters"),
repel = TRUE )+
ggpubr::fill_palette("cosmic")+ # Individual fill color
ggpubr::color_palette(c("brown", "purple", "red","blue")) + # Variable colors
theme_gray() +
theme(legend.position = "right",
legend.text = element_text(face="italic"),
plot.caption = element_text(hjust = 0),
legend.key.size = unit(0.5, 'cm'),
legend.background = element_rect(fill='transparent'),
panel.background = element_rect(colour = "grey30")) +
labs(title = "", x= "PC1 (62.56%)", y= "PC2 (29.10%)",
caption = "*1: Control , 2: Heat stress")
Genotype | Species | SPAD1 | Pn1 | Y(II)1 | DMC1 | SPAD2 | Pn2 | Y(II)2 | DMC2 |
---|---|---|---|---|---|---|---|---|---|
BEL | sp1 | 0.6 | 14.38 | 0.25 | 0.21 | 1.64 | 16.5 | 0.29 | -0.4 |
BGB003 | sp2 | -0.24 | 14.87 | 0.2 | -1.24 | -0.33 | 16.63 | 0.27 | -1.24 |
BGB008 | sp2 | -0.54 | 11.92 | 0.14 | -1.24 | -0.37 | 12.6 | 0.15 | -0.72 |
BGB083 | sp3 | -1.18 | 6.61 | 0.13 | 0.74 | -0.04 | 5.41 | 0.16 | 0.63 |
BGB086 | sp3 | -0.89 | 9.05 | 0.19 | -0.53 | -0.33 | 11.2 | 0.17 | -0.28 |
BGB088 | sp4 | -0.4 | 8.75 | 0.15 | -0.39 | 0.28 | 12.36 | 0.22 | -0.6 |
BGB089 | sp4 | -0.52 | 9.86 | 0.2 | -0.05 | 0.47 | 11.06 | 0.19 | -0.44 |
Your variable names are stored in the row names of the elements of pca$var. You can edit them to get the plot you desire. In particular, you can change e.g. DMC1 to DMC^1, and so on, to tell ggplot2 you want to use superscript:
pca$var <- lapply(pca$var, \(d) {
# Replace Y.II. with Y(II)
rownames(d) <- gsub("\\.II\\.", "(II)", rownames(d))
# Replace e.g. DMC1 with DMC^1
rownames(d) <- gsub("(1|2)$", "^\\1", rownames(d))
d
})
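If you would rather have the CT and HS superscripts mentioned in the question, the same renaming idea can be sketched on a plain character vector. This is only a sketch: `relabel_vars` is a hypothetical helper name, not something provided by FactoMineR or factoextra, and it assumes that a trailing 1 always means control and a trailing 2 always means heat stress. It also shows where "Y.II." comes from: `data.frame()` runs `make.names()` on the column names by default.

```r
# Hypothetical helper (not part of factoextra): turn mangled names such as
# "Y.II.1" into plotmath labels such as "Y(II)^CT".
relabel_vars <- function(x) {
  x <- gsub("\\.II\\.", "(II)", x)  # undo the make.names() mangling of "Y(II)"
  x <- sub("1$", "^CT", x)          # assumption: trailing 1 = control
  x <- sub("2$", "^HS", x)          # assumption: trailing 2 = heat stress
  x
}

# Reproduce what data.frame() does to the original column names
original <- c("SPAD1", "Pn1", "Y(II)1", "DMC1", "SPAD2", "Pn2", "Y(II)2", "DMC2")
mangled <- make.names(original)
labels <- relabel_vars(mangled)
print(labels)

# Each label has to be a valid R expression, otherwise parse = TRUE will fail
invisible(lapply(labels, function(l) parse(text = l)))
```

To use it on the PCA object, you could call `relabel_vars(rownames(d))` inside the `lapply()` above in place of the two `gsub()` lines.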
When drawing your plot, make sure to add parse = TRUE, so it knows that DMC^1 should be treated as superscript.
fviz_pca_biplot(pca,
geom.ind = c("point", "text"),
pointshape = 21,
pointsize = 2.5,
fill.ind = data$Species,
col.ind = "black",
col.var = traits,
legend.title = list(fill = "Species", color = "Parameters"),
repel = TRUE,
parse = TRUE # this is important
) # + all your theme code goes here
Output: