I'm trying to translate a JMP script into R. In my analyses, I have to perform the following linear model :
Y ~ G + C + G*C + C%in%L
(equivalent model writing in jmp would be G+ C + G * C + L[C]
).
I know JMP is giving type 3 SSQ, and I know how to normally get the equivalence in R using proper contrasts. But, I think because I have this nested effect, There is one term that still have a different SSQ in R ans in JMP and I cannot figure out why and how to get the same values. here is some simulation to illustrate the question :
# R code
# library('data.table')
set.seed(18)
TEST=data.table(
L= c( rep( LETTERS[seq( from = 1, to = 3 )],3), rep( LETTERS[seq( from = 4, to = 6 )],3),rep( LETTERS[seq( from = 7, to = 9 )],3)),
G= rep(sort(rep(LETTERS[seq( from = 1, to = 3 )],3)),3),
C=sort(rep(LETTERS[seq( from = 10, to = 12)],9)),
Y= rnorm(27)
)
set.seed(18)
# part add to have the data not orthogonal
TEST_add=data.table(
L= sort(rep( LETTERS[seq( from = 10, to = 14 )],2)),
G= rep(LETTERS[seq( from = 1, to = 2 )],5),
C = rep("L",10),
Y= rnorm(10)
)
TEST=rbind(TEST_add,TEST)
TEST$L = as.factor(TEST$L)
TEST$G = as.factor(TEST$G)
TEST$C = as.factor(TEST$C)
model <- lm(Y~ G +C+ C*G + C%in% L, data = TEST ,contrasts=list(G="contr.sum",C="contr.sum", L="contr.sum"))
drop1(model,.~.,test="F",all.cols = FALSE)
This will give me this table in R :
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 24.196 24.285
G 2 0.2945 24.491 20.733 0.1035 0.9023
C 2 0.4707 24.667 20.998 0.1653 0.8490
G:C 4 5.9115 30.108 24.373 1.0383 0.4163
C:L 11 21.7349 45.931 26.000 1.3882 0.2631
But JMP will return this table
Source DF Sum of Squares
C 2 0.876575
G 2 0.294510
L[C] 11 21.734947
G*C 4 5.911491
All fine but for the term "C" : SSQ is 0.877 in JMP and 0.47 in R
I have tried to change the contrasts but the question does not seem to be here. When I have only fixed effect or simple crossed effect, I have the perfect exact same numbers. I guess there the nested terme is not taken into account the same way in both program ?
My colleague did found me a solution wich I share here in case it can be useful to others. The R package 'sasLM' allows to have the same values that in JMP ( or SAS). '''
library(sasLM)
aov3(Formula = Y~ G +C+ C*G + C/L, Data=TEST, BETA=FALSE, Resid=FALSE)
does return
Response : Y
Df Sum Sq Mean Sq F value Pr(>F)
MODEL 19 30.508 1.60568 1.1281 0.4041
G 2 0.295 0.14725 0.1035 0.9023
C 2 0.877 0.43829 0.3079 0.7390
G:C 4 5.911 1.47787 1.0383 0.4163
C:L 11 21.735 1.97590 1.3882 0.2631
RESIDUALS 17 24.196 1.42331
CORRECTED TOTAL 36 54.704
Which perfectly match !
still, I dont know why results are differents, and interested in an answer!