r statistics pls

What do x.scores and y.scores represent in Partial Least Squares Regression in R?


I am analyzing some data in R using Partial Least Squares Regression. As I complete the regression, I stumble upon two matrices called "x.scores" and "y.scores". What are they and what do they represent?

#Input:
install.packages("plsdepot")   # one-time install
library("plsdepot")
# data.frame.x, data.frame.y and numComponents are placeholders for my own data
plsExample = plsreg2(data.frame.x, data.frame.y, comps = numComponents)
summary(plsExample)

#Output:
          Length Class  Mode   
x.scores   50    -none- numeric
x.loads    10    -none- numeric
y.scores   50    -none- numeric
y.loads    10    -none- numeric
cor.xt     10    -none- numeric
cor.yt     10    -none- numeric
cor.xu     10    -none- numeric
cor.yu     10    -none- numeric
cor.tu      4    -none- numeric

Solution

  • X-scores, usually denoted T, model X and at the same time serve as the predictors of Y. Each X-score is a linear combination of the original X variables, with coefficients given by the weights w. In the same way, the Y-scores, denoted U, multiplied by the weights c summarize the Y variables.

    In matrix notation, the desired decompositions have the following expressions:

    X = TP' + E

    Y = UC' + F

    These expressions read as follows (the prime denotes the transpose): X is decomposed into the score matrix T, the loading matrix P and the error matrix E. Similarly, Y is decomposed into the score matrix U, the loading matrix C and the error matrix F.

    So in short: x.scores contains the extracted PLS components (the T matrix), and y.scores contains the U components associated with the response variables.
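
    As a rough illustration of what the scores are, here is a minimal base-R sketch of a single NIPALS-style component on simulated data. It is not the plsdepot implementation, and the data and all variable names (t_score, u_score, w, c_wts, p_load) are made up for the example. It shows that the X-score is simply X multiplied by the weight vector w, and that subtracting t p' from X leaves a residual E that is orthogonal to t.

    ```r
    # One-component NIPALS-style PLS sketch (illustrative only, simulated data)
    set.seed(1)
    n <- 25
    X <- scale(matrix(rnorm(n * 3), n, 3))             # centered and scaled predictors
    Y <- scale(cbind(X[, 1] + rnorm(n, sd = 0.3),      # two responses driven by X
                     X[, 2] - X[, 3] + rnorm(n, sd = 0.3)))

    u_score <- Y[, 1, drop = FALSE]                    # start from the first Y column
    for (i in seq_len(100)) {
      w <- crossprod(X, u_score)                       # X-weights, proportional to X'u
      w <- w / sqrt(sum(w^2))                          # normalize w to unit length
      t_score <- X %*% w                               # X-score: linear combination of X columns
      c_wts <- crossprod(Y, t_score) / sum(t_score^2)  # Y-weights c
      u_new <- Y %*% c_wts / sum(c_wts^2)              # Y-score
      converged <- sum((u_new - u_score)^2) < 1e-12
      u_score <- u_new
      if (converged) break
    }

    p_load <- crossprod(X, t_score) / sum(t_score^2)   # X-loadings p
    E <- X - tcrossprod(t_score, p_load)               # residual of X = t p' + E

    max(abs(crossprod(E, t_score)))                    # numerically zero: E is orthogonal to t
    cor(t_score, u_score)                              # correlation between the two scores
    ```

    With several components, the scores are computed one at a time after deflating X, and stacking them column-wise gives matrices shaped like x.scores and y.scores (one row per observation, one column per component).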

    For more in-depth explanation see:

    https://hrcak.srce.hr/94324?lang=en

    https://learnche.org/pid/latent-variable-modelling/projection-to-latent-structures/how-the-pls-model-is-calculated

    And also this literature:

    Geladi P., Kowalski B. (1986) Partial Least Squares Regression: A tutorial. Analytica Chimica Acta, 185: 1-17.

    Tenenhaus M. (1998) La Regression PLS: Theorie et pratique. Paris: Editions TECHNIP.