r  statistics  refactoring  linear-regression  interaction

Why does the summary of a linear model in R not show all the expected levels?


I am trying to fit a linear model in R. I have 24 experiments (a complete factorial design) and 3 factors in the model. The Density factor has 3 levels (B, M, A). I know that DensityB does not need to appear, because when DensityM and DensityA are both 0, DensityB is implied. In the interaction, however, I expected to see DensityB:MatS, since MatN can be implied by a 0. Instead, this is what I get:

                    Estimate Std. Error t value Pr(>|t|)
    (Intercept)      0.35500    0.06094   5.826 2.03e-05 ***
    Thickness2       0.11516    0.04606   2.500  0.02294 *
    DensityM        -0.05080    0.07978  -0.637  0.53279
    DensityA        -0.24315    0.07978  -3.048  0.00728 **
    MatS             0.22882    0.07978   2.868  0.01066 *
    DensityM:MatS   -0.21393    0.11283  -1.896  0.07509 .
    DensityA:MatS   -0.27452    0.11283  -2.433  0.02631 *

This does not happen when I skip reordering the levels of the factor, which I do with this line:

    df$Density = factor(df$Density, levels = c("B", "M", "A"))

Without that line, these are the results:

                    Estimate Std. Error t value Pr(>|t|)
    (Intercept)      0.11185    0.06094   1.835  0.08399 .
    Thickness2       0.11516    0.04606   2.500  0.02294 *
    DensityB         0.24315    0.07978   3.048  0.00728 **
    DensityM         0.19235    0.07978   2.411  0.02751 *
    DensityA:MatS   -0.04570    0.07978  -0.573  0.57426
    DensityB:MatS    0.22882    0.07978   2.868  0.01066 *
    DensityM:MatS    0.01489    0.07978   0.187  0.85412

And they are correct.

Why does reordering the levels of the factor change this interaction? I need to reorder the levels because I want DensityM and DensityA to appear in the linear model, with DensityB as the baseline level (so that when DensityM and DensityA are both 0, DensityB is implied).

The adjusted R-squared and the p-value of the linear model are the same in both cases.


Solution

  • This is simply a consequence of over-parameterisation and is nothing to worry about. With an intercept in the model, one level of each factor has to serve as the reference level, and R takes the first level of your factor Density for that role: "B" after your reordering, "A" under the default alphabetical order. The effects of the other levels are simply differences from the reference level.

    To see this, in your first model, with "B" as the reference level, the difference between "M" and "A" is -0.05080 - (-0.24315) = 0.19235. In your second model, with "A" as the reference level, the coefficient of "M" (i.e. the estimated difference between "M" and "A") is 0.19235. Exactly the same value.

    You can work out the value of any effect you like from either model, and the two values will be identical. You just need to take account of the parametrisation that the model has used.
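
As a rough illustration of the point above, the sketch below first converts the coefficients printed in the first summary into the ones printed in the second, using only arithmetic on the numbers from the question. Read that way, each interaction row of the second summary appears to be the MatS effect within one density level. The second half refits a model under both level orderings on simulated data. The data frame `d`, the response `y`, the `Rep` column and the formula `y ~ Thickness + Density * Mat` are assumptions made for illustration, since the question shows neither the data nor the model formula.

```r
# Part 1: arithmetic on the coefficients reported in the question.
# First summary, reference level "B" (names with "_" stand for the ":" terms).
b <- c(Intercept = 0.35500, DensityM = -0.05080, DensityA = -0.24315,
       MatS = 0.22882, DensityM_MatS = -0.21393, DensityA_MatS = -0.27452)

# Re-express them in the parameterisation of the second summary (reference "A").
derived <- c(
  Intercept     = b[["Intercept"]] + b[["DensityA"]],  # baseline moves from B to A
  DensityB      = -b[["DensityA"]],                    # B minus A
  DensityM      = b[["DensityM"]] - b[["DensityA"]],   # M minus A
  DensityA_MatS = b[["MatS"]] + b[["DensityA_MatS"]],  # MatS effect within A
  DensityB_MatS = b[["MatS"]],                         # MatS effect within B
  DensityM_MatS = b[["MatS"]] + b[["DensityM_MatS"]]   # MatS effect within M
)

# Values printed in the question's second summary.
reported <- c(Intercept = 0.11185, DensityB = 0.24315, DensityM = 0.19235,
              DensityA_MatS = -0.04570, DensityB_MatS = 0.22882,
              DensityM_MatS = 0.01489)

print(round(derived, 5))
stopifnot(isTRUE(all.equal(derived, reported, tolerance = 1e-4)))

# Part 2: hypothetical 24-run design; y is simulated noise, NOT the question's data.
set.seed(1)
d <- expand.grid(Thickness = c("1", "2"), Density = c("A", "B", "M"),
                 Mat = c("N", "S"), Rep = 1:2)
d$y <- rnorm(nrow(d))

# Assumed formula; reference level is "A", the first level of Density.
fit_A <- lm(y ~ Thickness + Density * Mat, data = d)

# Same model after reordering the levels so that "B" comes first.
d$Density <- factor(d$Density, levels = c("B", "M", "A"))
fit_B <- lm(y ~ Thickness + Density * Mat, data = d)

# Same fit, different labelling of the coefficients.
stopifnot(isTRUE(all.equal(fitted(fit_A), fitted(fit_B))))
stopifnot(isTRUE(all.equal(summary(fit_A)$adj.r.squared,
                           summary(fit_B)$adj.r.squared)))

# "M minus A" can be recovered from either fit.
m_minus_a_from_B <- coef(fit_B)[["DensityM"]] - coef(fit_B)[["DensityA"]]
stopifnot(isTRUE(all.equal(m_minus_a_from_B, coef(fit_A)[["DensityM"]])))
```

If the reference level is all that needs to change, `relevel(df$Density, ref = "B")` is an alternative to spelling out the full level order, although it leaves the remaining levels in their existing order.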