I know how to produce a PCA plot through ggbiplot and this package works well.
Now I want to modify some specific points, such as their color and size, and especially add circles around some points without covering them with the geom_encircle() function.
Here is my reproducible example code below:
#load required packages
library(ggplot2)
library(devtools)
library(ggbiplot)
#load dataset
data(iris)
#perform principal component analysis
pca <- prcomp(iris[, 1:4], scale. = TRUE)
#define classes, generate & view PCA biplot
class = iris$Species
ggbiplot(pca, obs.scale = 1, var.scale = 1, groups = class, circle = FALSE) +
  geom_point(size = 3, aes(color = class)) +
  geom_point(data = iris[iris$Species == "setosa", ], pch = 21, fill = NA, size = 2, colour = "black", stroke = 2)
However, this produces the following error:
Error in `geom_point()`:
! Problem while computing aesthetics.
i Error occurred in the 5th layer.
Caused by error in `FUN()`:
! object 'xvar' not found
Run `rlang::last_trace()` to see where the error occurred.
I suspect it is caused by the data passed to geom_point(), which is not consistent with pca, but I don't know how I should set the data in geom_point().
I hope somebody can give me some advice or a solution.
Thanks in advance.
You can do this in a hacky way by using ggplot_build() to retrieve the data frame that was constructed by ggbiplot.
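# build the biplot from the question, keeping only the layers that work
gg0 <- ggbiplot(pca, obs.scale = 1, var.scale = 1, groups = class, circle = FALSE) +
  geom_point(size = 3, aes(color = class))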
ggb <- ggplot_build(gg0)
ggb$data is a list with a data frame for each layer of the plot. By poking around a bit we can figure out that the geom_point layer is the last (fourth), i.e. ggb$data[[4]]. All we need from this is the x and y coordinates, which we can combine with the original data set (hoping that row order is preserved, there weren't any incomplete cases discarded, etc.)
my_data <- cbind(iris, ggb$data[[4]][c("x", "y")])
m2 <- subset(my_data, Species == "setosa")
library(ggalt)  # provides geom_encircle()
gg0 +
  geom_encircle(data = m2, aes(x = x, y = y)) +
  geom_point(data = m2, aes(x = x, y = y),
             pch = 21, fill = NA, size = 2, colour = "black", stroke = 2)
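If you would rather not hard-code the layer index, here is a minimal sketch of one way to look it up and to sanity-check the row count before combining. The helper last_point_layer() is a hypothetical name, not part of ggplot2 or ggbiplot, and the toy plot is only a stand-in with the same layer layout (four layers, points last), so the example runs with ggplot2 alone. It assumes the plot's layers are accessible as p$layers.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2 or ggbiplot): index of the last
# layer whose geom is a point geom.
last_point_layer <- function(p) {
  is_point <- vapply(p$layers,
                     function(layer) inherits(layer$geom, "GeomPoint"),
                     logical(1))
  if (!any(is_point)) stop("plot has no geom_point() layer")
  max(which(is_point))
}

# Stand-in plot with four layers, points last
# (an assumption for illustration; this is not the ggbiplot object).
toy <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_line() +
  geom_point() +
  geom_text(aes(label = Species)) +
  geom_point(aes(colour = Species), size = 3)

idx <- last_point_layer(toy)
coords <- ggplot_build(toy)$data[[idx]][c("x", "y")]

# A row-count check catches dropped rows; it cannot detect reordering.
stopifnot(nrow(coords) == nrow(iris))
toy_data <- cbind(iris, coords)
```

With the plot from this answer, last_point_layer(gg0) should give the same index as the one found by poking around, provided your ggbiplot version builds its layers as described above.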