
Fit linear model in R for various values


In this experiment, four different diets were tried on animals. Then researchers measured their effects on blood coagulation time.

 ## Data :
    coag diet
 1    62    A
 2    60    A
 3    63    A
 4    59    A
 5    63    B
 6    67    B
 7    71    B
 8    64    B
 9    65    B
 10   66    B
 11   68    C
 12   66    C
 13   71    C
 14   67    C
 15   68    C
 16   68    C
 17   56    D
 18   62    D
 19   60    D
 20   61    D
 21   63    D
 22   64    D
 23   63    D
 24   59    D

I am trying to fit a linear model for coag ~ diet using the function lm in R. The results should look like the following:

> modelSummary$coefficients
                 Estimate Std. Error       t value     Pr(>|t|)
(Intercept)  6.100000e+01   1.183216  5.155441e+01 9.547815e-23
dietB        5.000000e+00   1.527525  3.273268e+00 3.802505e-03
dietC        7.000000e+00   1.527525  4.582576e+00 1.805132e-04
dietD       -1.071287e-14   1.449138 -7.392579e-15 1.000000e+00

My code so far does not reproduce these results:

coagulation$x1 <- 1*(coagulation$diet=="B")
coagulation$x2 <- 1*(coagulation$diet=="C")
coagulation$x3 <- 1*(coagulation$diet=="D")
modelSummary <- lm(coag~1+x1+x2+x3, data=coagulation)

Solution

  • "diet" is a character variable, and lm automatically treats it as a factor. So you can leave out the dummy coding and just do:

    summary(lm(coag ~ diet, data=coagulation))$coefficients
    #                 Estimate Std. Error      t value     Pr(>|t|)
    # (Intercept) 6.100000e+01   1.183216 5.155441e+01 9.547815e-23
    # dietB       5.000000e+00   1.527525 3.273268e+00 3.802505e-03
    # dietC       7.000000e+00   1.527525 4.582576e+00 1.805132e-04
    # dietD       2.991428e-15   1.449138 2.064281e-15 1.000000e+00
    

    Even if "diet" were a numeric variable and you wanted R to treat it as categorical rather than continuous, no dummy coding would be needed; you would just add it as + factor(diet) in the formula.

    As you can see, 1 + is also redundant, since lm includes the (Intercept) by default. To omit the intercept, use 0 + (or - 1).
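
The points above can be checked with a short, self-contained sketch. It rebuilds the data from the table in the question, so nothing outside base R is needed. The column diet_code and the object names fit_factor, fit_dummy, fit_code and fit_means are made up for illustration and do not appear in the original post. The sketch also suggests one plausible reason the question's output did not match: there, modelSummary holds the lm object itself, whose $coefficients is a plain named vector, while the four-column table comes from summary().

```r
# Rebuild the data from the table in the question.
coagulation <- data.frame(
  coag = c(62, 60, 63, 59,
           63, 67, 71, 64, 65, 66,
           68, 66, 71, 67, 68, 68,
           56, 62, 60, 61, 63, 64, 63, 59),
  diet = rep(c("A", "B", "C", "D"), times = c(4, 6, 6, 8)),
  stringsAsFactors = FALSE
)

# Character predictor: lm() converts it to a factor, with "A" as the
# reference level (levels are sorted alphabetically).
fit_factor <- lm(coag ~ diet, data = coagulation)

# Manual dummy coding from the question, for comparison.
coagulation$x1 <- 1 * (coagulation$diet == "B")
coagulation$x2 <- 1 * (coagulation$diet == "C")
coagulation$x3 <- 1 * (coagulation$diet == "D")
fit_dummy <- lm(coag ~ x1 + x2 + x3, data = coagulation)

# Both fits give the same estimates (up to floating-point noise);
# only the coefficient names differ.
same_fit <- all.equal(unname(coef(fit_factor)), unname(coef(fit_dummy)))

# The table with Std. Error, t value and Pr(>|t|) comes from summary().
# The lm object's own $coefficients is just a named numeric vector.
coef_table <- summary(fit_factor)$coefficients
colnames(coef_table)

# Numeric group codes (hypothetical column diet_code): wrap in factor()
# so the codes are treated as categories, not as a continuous predictor.
coagulation$diet_code <- match(coagulation$diet, c("A", "B", "C", "D"))
fit_code <- lm(coag ~ factor(diet_code), data = coagulation)

# Without the intercept, each coefficient is the mean of its diet group.
fit_means <- lm(coag ~ 0 + diet, data = coagulation)
group_means <- tapply(coagulation$coag, coagulation$diet, mean)
group_means  # A = 61, B = 66, C = 68, D = 61
```

The dietD estimate is zero up to rounding because groups A and D have the same mean (61), which is why the two outputs above show different tiny values such as -1.07e-14 and 2.99e-15 for it.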