r anova

How is eta squared calculated?


I wonder how eta squared is calculated in this slide. I thought it was the ratio of the between-group sum of squares to the total sum of squares.

I tried to reproduce their data and then calculated eta squared myself, but I ended up with a different result.

G1 <- c(6.3, 2.8, 7.8, 7.9, 4.9)
G2 <- c(9.9, 4.1, 3.9, 6.3, 6.9)
G3 <- c(5.1, 2.9, 3.6, 5.7, 4.5)
G4 <- c(1.0, 2.8, 4.8, 3.9, 1.6)

n1 <- length(G1); n2 <- length(G2); n3 <- length(G3); n4 <- length(G4)

m1 <- sum(G1)/n1; m2 <- sum(G2)/n2; m3 <- sum(G3)/n3; m4 <- sum(G4)/n4
Gmean <- sum(c(G1,G2,G3,G4))/(n1 + n2 + n3 + n4)

SSb <- n1*((m1-Gmean)^2) + n2*((m2-Gmean)^2) + n3*((m3-Gmean)^2) + n4*((m4-Gmean)^2)

S1 <- var(G1); S2 <- var(G2); S3 <- var(G3); S4 <- var(G4)
SSw <- (n1-1)*S1 + (n2-1)*S2 + (n3-1)*S3 + (n4-1)*S4

SSt <- SSb + SSw
eta2 <- SSb / SSt

My formula may be wrong.
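
For reference, this is the definition I am assuming, written with the same names as in my code (SSb, SSw, SSt). The index j runs over the k = 4 groups and i over the observations within a group; that notation is mine, not the slide's:

```latex
\begin{aligned}
SS_b &= \sum_{j=1}^{k} n_j \,(\bar{y}_j - \bar{y})^2 \\
SS_w &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 = \sum_{j=1}^{k} (n_j - 1)\, s_j^2 \\
SS_t &= SS_b + SS_w \\
\eta^2 &= \frac{SS_b}{SS_t}
\end{aligned}
```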



Solution

  • The slide seems to be wrong. I calculated eta squared in two ways to make sure I was not missing something. Both give the same result, and it differs from the one on your slide. If I had made an error, I would have had to make it in both approaches.

    G1 <- c(6.3, 2.8, 7.8, 7.9, 4.9)
    G2 <- c(9.9, 4.1, 3.9, 6.3, 6.9)
    G3 <- c(5.1, 2.9, 3.6, 5.7, 4.5)
    G4 <- c(1.0, 2.8, 4.8, 3.9, 1.6)
    
    #method 1 (sum of squares as sum of squared deviations):
    n1 <- n2 <- n3 <- n4 <- 5
    
    m1 <- sum(G1) / n1
    m2 <- sum(G2) / n2
    m3 <- sum(G3) / n3
    m4 <- sum(G4) / n4
    
    Gmean <- sum(c(G1, G2, G3, G4)) / (n1 + n2 + n3 + n4)
    
    SSb <- n1 * ((m1 - Gmean)^2) + 
      n2 * ((m2 - Gmean)^2) + 
      n3 * ((m3 - Gmean)^2) + 
      n4 * ((m4 - Gmean)^2)
    
    SSw <- sum((G1 - m1)^2) + sum((G2 - m2)^2) + sum((G3 - m3)^2) + sum((G4 - m4)^2)
    SSt <- SSb + SSw
    eta2 <- SSb / SSt # 0.3935058
    
    #method 2 (computational formulas: sums of squares as differences of raw sums):
    Y <- sum(c(G1, G2, G3, G4)^2)
    A <- m1^2*n1 + m2^2*n2 + m3^2*n3 + m4^2*n4
    Te <- Gmean^2*sum(n1, n2, n3, n4)
    
    SSW <- Y-A
    SSB <- A-Te
    SST <- Y-Te
    eta2_n <- SSB/SST # 0.3935058
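
    Method 2 relies on the usual computational shortcuts, which follow from expanding the squared deviations. Written out with Y, A and T standing for the quantities the code calls Y, A and Te (N is the total sample size; the subscript notation is an assumption for illustration):

    ```latex
    \begin{aligned}
    Y &= \sum_{j}\sum_{i} y_{ij}^2, \qquad
    A = \sum_{j} n_j \,\bar{y}_j^{\,2}, \qquad
    T = N \,\bar{y}^{\,2} \\
    SS_W &= Y - A, \qquad
    SS_B = A - T, \qquad
    SS_T = Y - T \\
    \eta^2 &= \frac{SS_B}{SS_T} = \frac{A - T}{Y - T}
    \end{aligned}
    ```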
    

    When conducting the ANOVA directly in R, I get the same result:

    data <- data.frame(program = rep(c("G1", "G2", "G3", "G4"), each = 5),
                       weight_loss = c(6.3, 2.8, 7.8, 7.9, 4.9,
                                       9.9, 4.1, 3.9, 6.3, 6.9,
                                       5.1, 2.9, 3.6, 5.7, 4.5,
                                       1.0, 2.8, 4.8, 3.9, 1.6))
    
    model <- aov(weight_loss ~ program, data = data)
    summary(model)
    
    #             Df Sum Sq Mean Sq F value Pr(>F)  
    # program      3  37.13  12.375    3.46 0.0414 *
    # Residuals   16  57.22   3.576   
    
    summary(model)[[1]][["Sum Sq"]][1]/sum(summary(model)[[1]][["Sum Sq"]])
    # 0.3935058
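
The same calculation can also be wrapped in a small function, so that it works for any number of groups and for unequal group sizes. This is only a sketch in base R: the helper name `eta_squared` is hypothetical, and it assumes a one-way design with a numeric response and a single grouping variable.

```r
# Hypothetical helper (not from any package): eta squared for a one-way layout.
# y: numeric response, g: grouping variable of the same length.
eta_squared <- function(y, g) {
  g <- factor(g)
  grand_mean <- mean(y)
  # ave() returns, for every observation, the mean of its own group
  group_means <- ave(y, g)
  ss_between <- sum((group_means - grand_mean)^2)
  ss_total <- sum((y - grand_mean)^2)
  ss_between / ss_total
}

# Same data as above
weight_loss <- c(6.3, 2.8, 7.8, 7.9, 4.9,
                 9.9, 4.1, 3.9, 6.3, 6.9,
                 5.1, 2.9, 3.6, 5.7, 4.5,
                 1.0, 2.8, 4.8, 3.9, 1.6)
program <- rep(c("G1", "G2", "G3", "G4"), each = 5)

eta2_fn <- eta_squared(weight_loss, program)
print(eta2_fn)
```

Because the total sum of squares is computed directly from the grand mean rather than as SSb + SSw, this also serves as a check that the decomposition holds. It should reproduce the 0.3935058 obtained above.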