r pca factoextra

R PCA: with the fviz_pca_ind function, can we show two categorical variables, one as point shape and one as fill color?


I am trying to make a PCA plot of individuals where:

  • one categorical variable (A) is represented by the point shape (e.g. one group as a circle, a second one as a square, etc.), and
  • a second categorical variable (B) is represented by the color inside the point.

Is that possible? Which code would you use?


Solution

  • I don't think fviz_pca_ind() can map two different variables to shape and color in a single call, so you would need to extract the plotting data from its result and plot it again using ggplot2:

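    library(factoextra)
    library(ggplot2)
    
    # Example data: A is the species factor, B is a random two-level placeholder
    set.seed(1)  # make the random B reproducible
    data <- iris
    colnames(data)[5] <- "A"
    data$B <- sample(letters[1:2], nrow(data), replace = TRUE)
    
    # PCA on the four numeric columns, scaled to unit variance
    res.pca <- prcomp(data[, 1:4], scale. = TRUE)
    basic_plot <- fviz_pca_ind(res.pca, label = "none")
    
    # basic_plot$data holds the plotted coordinates in columns x and y
    ggplot(cbind(basic_plot$data, data[, c("A", "B")]),
           aes(x = x, y = y, col = A, shape = B)) +
      geom_point() +
      theme_bw()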
    

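The code above maps A to the outline color and B to the shape. If the goal is literally shape for A and interior fill for B, as the question asks, a minimal sketch could look like the following. It assumes the same iris-based example data, and it takes the coordinates from `res.pca$x` rather than from `fviz_pca_ind()`, which should give the same individual scores without needing factoextra. The names `scores` and `fill_plot` are only illustrative. Only point shapes 21 to 25 have a separate fill, so the shape scale is restricted to those.

```r
library(ggplot2)

set.seed(42)  # reproducible placeholder for B
data <- iris
colnames(data)[5] <- "A"
data$B <- sample(letters[1:2], nrow(data), replace = TRUE)

res.pca <- prcomp(data[, 1:4], scale. = TRUE)

# Scores on the first two components, plus the two grouping variables
scores <- data.frame(res.pca$x[, 1:2], data[, c("A", "B")])

fill_plot <- ggplot(scores, aes(x = PC1, y = PC2, shape = A, fill = B)) +
  geom_point(size = 3, colour = "black") +
  # Shapes 21-25 are the fillable ones: circle, square, triangle here
  scale_shape_manual(values = c(21, 22, 24)) +
  # Use a fillable shape in the fill legend so its colors are visible
  guides(fill = guide_legend(override.aes = list(shape = 21))) +
  theme_bw()

fill_plot
```

With more than five levels in A this approach runs out of fillable shapes, in which case mapping A to color and B to shape (as in the answer) may be the more practical choice.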