r statistics glmnet

Why does glmnet standardisation only scale X?


I've been reading about the impact of standardizing in glmnet. It's my understanding that when we set standardize = TRUE, the predictors are centered and scaled to have mean 0 and standard deviation 1. However, the glmnet:::glmnet.path function, which is used internally to fit the model, contains the following code:

    if (intercept) {
        xm <- meansd$mean
    }
    else {
        xm <- rep(0, times = nvars)
    }
    if (standardize) {
        xs <- meansd$sd
    }
    else {
        xs <- rep(1, times = nvars)
    }

So it seems that the data $X$ is scaled only when standardize = TRUE, and centered only when intercept = TRUE. Why is this?


Solution

  • There is also this:

    
        # return coefficients to original scale (because of x standardization)
        beta <- beta / xs
        a0 <- a0 - colSums(beta * xm)
    
    

    This shows that if you do scale, the coefficients have to be returned to the original scale after fitting. If you choose not to scale (or not to center), the same operations should change nothing. That is why xs and xm are numeric vectors rather than TRUE/FALSE flags: when xs is all ones and xm is all zeros, this piece of code is a no-op. The same vectors also work with the scale function, whose center and scale arguments accept numeric vectors as well as logicals.
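The back-transformation can be checked with a small base R sketch. It is not glmnet's actual code: ordinary least squares via lm stands in for the penalised fit, the helper name fit_and_unscale is hypothetical, the data are simulated, and sd() is used for the scale vector purely for illustration. The idea is that a fit on $(x_j - m_j)/s_j$ with coefficients $a_0', \beta_j'$ corresponds to $\beta_j = \beta_j'/s_j$ and $a_0 = a_0' - \sum_j \beta_j m_j$ on the original scale, and that zeros for xm and ones for xs leave everything unchanged.

```r
set.seed(1)
n <- 50
p <- 3
X <- matrix(rnorm(n * p, mean = 5, sd = 2), n, p)   # simulated predictors
y <- drop(1 + X %*% c(2, -1, 0.5) + rnorm(n))       # simulated response

# Hypothetical helper: fit on the transformed X, then map the
# coefficients back to the original scale of X.
fit_and_unscale <- function(X, y, xm, xs) {
  Xt <- scale(X, center = xm, scale = xs)  # numeric center/scale vectors
  fit <- lm(y ~ Xt)                        # OLS stand-in for the glmnet fit
  a0 <- unname(coef(fit)[1])
  beta <- unname(coef(fit)[-1])
  beta <- beta / xs                        # undo the scaling
  a0 <- a0 - sum(beta * xm)                # undo the centering (one fit, so sum() not colSums())
  c(a0, beta)
}

ref <- unname(coef(lm(y ~ X)))             # fit directly on the raw X

# "intercept = TRUE, standardize = TRUE" analogue
std <- fit_and_unscale(X, y, xm = colMeans(X), xs = apply(X, 2, sd))

# "no centering, no scaling" analogue: xm is zeros, xs is ones
raw <- fit_and_unscale(X, y, xm = rep(0, p), xs = rep(1, p))

all.equal(std, ref)
all.equal(raw, ref)
```

For unpenalised least squares both comparisons should agree with the direct fit up to numerical tolerance. With a lasso or ridge penalty the standardised and unstandardised fits generally differ, because the penalty acts on the scaled coefficients; the back-transformation only puts the reported coefficients back on the scale of the original X.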