I am trying to compute the proportion of variance explained by each component in a PLS-DA, using the tidymodels framework.
Here's the "gold standard" result with the mixOmics package:
library(mixOmics)
mix_plsda <- plsda(X = iris[-5], Y = iris$Species, ncomp = 4)
mix_var_expl <- mix_plsda$prop_expl_var$X
mix_var_expl
#>       comp1       comp2       comp3       comp4
#> 0.729028323 0.227891235 0.037817718 0.005262724
sum(mix_var_expl) # check
#> [1] 1
And here is the same model with recipes::step_pls():
library(recipes)
tidy_plsda <-
  recipe(Species ~ ., data = iris) %>%
  step_pls(all_numeric_predictors(), outcome = "Species", num_comp = 4) %>%
  prep()
tidy_sd <- tidy_plsda$steps[[1]]$res$sd
tidy_sd
#> [1] 0.8280661 0.4358663 1.7652982 0.7622377
tidy_sd ^2 / sum(tidy_sd^2)
#> [1] 0.14994532 0.04154411 0.68145793 0.12705264
The element that looks most like an explained variance is sd, but as you can see, there is no obvious relationship between these two vectors.
How can I get mix_var_expl from tidy_plsda? Thanks!
Created on 2022-09-20 by the reprex package (v2.0.1)
The recipe object does not save the mixOmics model, only the parts that we need to process new data. The sd element contains the standard deviations of the predictors, which is why it does not match the explained variance. There is currently no way to get what you want from the object.
I've added a GitHub issue to add more objects to the results though: https://github.com/tidymodels/recipes/issues/1038
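As a quick sanity check on what sd holds, here is a minimal base-R sketch that compares the column standard deviations of iris with the values printed for tidy_sd in the question. The names predictor_sd and reported_sd are mine for illustration; they are not part of recipes.

```r
# Standard deviation of each numeric predictor in iris (base R only)
predictor_sd <- sapply(iris[-5], sd)
round(predictor_sd, 7)

# Values printed for tidy_sd in the question, copied by hand
reported_sd <- c(0.8280661, 0.4358663, 1.7652982, 0.7622377)

# Largest absolute difference between the two vectors;
# a value near zero means sd is just the per-column standard deviation
max(abs(unname(predictor_sd) - reported_sd))
```

If the difference is effectively zero, sd describes the raw predictors and carries no information about the PLS components. Until the recipe stores more of the fit, calling mixOmics::plsda() directly, as in the question, remains the way to obtain prop_expl_var.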