PLS-DA individuals plot with caret package in R


I would like to plot the individuals (scores) of a PLS-DA fitted with the caret package in R, similar to a PCA score plot, with points colored by group. The attached picture shows a PCA example; I would like the same kind of graph for PLS-DA. Can someone help me with this?

[Image: PCA plot]

Here is some code to generate random data similar to the data I used. Ycalib contains the class variable with 2 levels; Xcalib contains the spectra (539 wavelengths in my real data, while the code below generates only 10).

set.seed(1001)
Ycalib <- data.frame(
    y = sample(c("0", "1"), 10, replace = TRUE)
)

set.seed(1001)
Xcalib <- data.frame(
    x1 = sample(1:10),
    x2 = sample(1:10),
    x3 = sample(1:10),
    x4 = sample(1:10),
    x5 = sample(1:10),
    x6 = sample(1:10),
    x7 = sample(1:10),
    x8 = sample(1:10),
    x9 = sample(1:10),
    x10 = sample(1:10)
)

Here is my code for PLS-DA in caret:

library(caret)
set.seed(1001)
ctrl <- trainControl(method = "repeatedcv", number = 10,
                     classProbs = TRUE, summaryFunction = twoClassSummary)
plsda <- train(x = Xcalib,                      # spectral data
               y = Ycalib,                      # factor vector
               method = "pls",                  # PLS-DA algorithm
               tuneLength = 10,                 # try up to 10 components
               trControl = ctrl,                # cross-validation settings
               preProc = c("center", "scale"),  # center and scale the data
               metric = "ROC")                  # ROC as the metric for 2 classes
plsda

I hope it is clear enough as I am a beginner with R.


Solution

  • The scores you need are stored in the finalModel element of the object returned by train. Using your object names above, extract them as follows:

    # unclass() drops the "scores" class so the matrix converts to a data frame
    dfscores <- as.data.frame(unclass(plsda$finalModel$scores))
    

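    Then you can bind this to your original data and plot as you wish, e.g., using ggplot2. Here Ycalib must be the factor vector that was passed to train; if it is a data frame, as in the sample data, use its column (Ycalib$y) instead. Print p to display the plot.

    library(ggplot2)
    p <- ggplot(dfscores, aes(x = `Comp 1`, y = `Comp 2`, color = Ycalib)) + geom_point()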
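
For reference, here is a minimal end-to-end sketch of the approach above, under some assumptions. The data are simulated (60 samples instead of 10, so that cross-validation has both classes in every fold), and the names n, group, X, fit and scores are purely illustrative. The class levels are named class0 and class1 because caret requires valid R variable names as levels when classProbs = TRUE, which "0" and "1" are not. The tuning grid starts at 2 components only so that a second component always exists to plot.

```r
library(caret)    # method = "pls" also needs the pls package installed
library(ggplot2)

set.seed(1001)
n <- 60  # hypothetical sample size, large enough for 5-fold CV

# Class labels as a factor whose levels are valid R names
# (an assumption for illustration; the real labels may differ).
group <- factor(rep(c("class0", "class1"), each = n / 2))

# Ten simulated "wavelengths"; the first three are shifted for class1
# so that the two groups separate in the score plot.
X <- as.data.frame(matrix(rnorm(n * 10), nrow = n))
X[group == "class1", 1:3] <- X[group == "class1", 1:3] + 1.5

ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 3,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

# tuneGrid starts at 2 so the final model always has a second component.
fit <- train(x = X, y = group, method = "pls",
             tuneGrid = data.frame(ncomp = 2:4),
             trControl = ctrl, preProc = c("center", "scale"),
             metric = "ROC")

# Scores of the final model, one row per training sample.
scores <- as.data.frame(unclass(fit$finalModel$scores))
scores$group <- group

p <- ggplot(scores, aes(x = `Comp 1`, y = `Comp 2`, color = group)) +
  geom_point(size = 3) +
  labs(x = "PLS component 1", y = "PLS component 2", color = "Group")
print(p)
```

If you want the plot to look closer to a typical PCA figure, adding stat_ellipse() to p should draw an ellipse around each group, although with very few samples per group the ellipses may not be meaningful.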