rmodelnested

Differences in sum of squares between JMP and R when nested effects in the model


I'm trying to translate a JMP script into R. In my analyses, I have to perform the following linear model : Y ~ G + C + G*C + C%in%L (equivalent model writing in jmp would be G+ C + G * C + L[C]).

I know JMP is giving type 3 SSQ, and I know how to normally get the equivalence in R using proper contrasts. But, I think because I have this nested effect, There is one term that still have a different SSQ in R ans in JMP and I cannot figure out why and how to get the same values. here is some simulation to illustrate the question :

# R code
# library('data.table')
set.seed(18)
TEST=data.table(
 L= c( rep( LETTERS[seq( from = 1, to = 3 )],3), rep( LETTERS[seq( from = 4, to = 6 )],3),rep( LETTERS[seq( from = 7, to = 9 )],3)),
 G= rep(sort(rep(LETTERS[seq( from = 1, to = 3 )],3)),3),
 C=sort(rep(LETTERS[seq( from = 10, to = 12)],9)),
 Y= rnorm(27)
 )
 set.seed(18) 
# part add to have the data not orthogonal
TEST_add=data.table(
 L= sort(rep( LETTERS[seq( from = 10, to = 14 )],2)),
G= rep(LETTERS[seq( from = 1, to = 2 )],5),
C = rep("L",10),
 Y= rnorm(10)
)
TEST=rbind(TEST_add,TEST)
 
 TEST$L = as.factor(TEST$L)
 TEST$G = as.factor(TEST$G)
 TEST$C = as.factor(TEST$C)  
   
model <- lm(Y~ G  +C+ C*G +  C%in% L, data = TEST ,contrasts=list(G="contr.sum",C="contr.sum", L="contr.sum")) 
drop1(model,.~.,test="F",all.cols = FALSE)

This will give me this table in R :

       Df Sum of Sq    RSS    AIC F    value Pr(>F)
<none>                24.196   24.285               
G       2    0.2945   24.491   20.733  0.1035 0.9023
C       2    0.4707   24.667   20.998  0.1653 0.8490
G:C     4    5.9115   30.108   24.373  1.0383 0.4163
C:L    11   21.7349   45.931   26.000  1.3882 0.2631

But JMP will return this table

Source  DF  Sum of Squares
C       2   0.876575
G       2   0.294510
L[C]    11  21.734947
G*C     4   5.911491

All fine but for the term "C" : SSQ is 0.877 in JMP and 0.47 in R

I have tried to change the contrasts but the question does not seem to be here. When I have only fixed effect or simple crossed effect, I have the perfect exact same numbers. I guess there the nested terme is not taken into account the same way in both program ?


Solution

  • My colleague did found me a solution wich I share here in case it can be useful to others. The R package 'sasLM' allows to have the same values that in JMP ( or SAS). '''

    library(sasLM)
    aov3(Formula = Y~ G  +C+ C*G +  C/L, Data=TEST, BETA=FALSE, Resid=FALSE)
    

    does return

        Response : Y
                    Df Sum Sq Mean Sq F value Pr(>F)
    MODEL           19 30.508 1.60568  1.1281 0.4041
     G               2  0.295 0.14725  0.1035 0.9023
     C               2  0.877 0.43829  0.3079 0.7390
     G:C             4  5.911 1.47787  1.0383 0.4163
     C:L            11 21.735 1.97590  1.3882 0.2631
    RESIDUALS       17 24.196 1.42331               
    CORRECTED TOTAL 36 54.704 
    

    Which perfectly match !

    still, I dont know why results are differents, and interested in an answer!