r, mean, causality

Estimate significance of double difference in means


This may be a trivial question.

In my data, I have two groups, grp1 and grp2. Within each group, some observations are assigned to the treatment condition and some to the control condition.

My question is whether the effect of the treatment on dv differs significantly between grp1 and grp2. In effect, this is a difference in differences. I want to test whether the following quantity is significantly different from zero:

dd = (mean(dv_grp1_treat) - mean(dv_grp1_control)) - (mean(dv_grp2_treat) - mean(dv_grp2_control))

# create data
# install librarian only if it is missing, then attach the packages used below
if (!requireNamespace("librarian", quietly = TRUE)) install.packages("librarian")
librarian::shelf(tidyverse, truncnorm)
# one data frame per cell; dv is drawn from a normal distribution truncated to [0, 4]
aud_tr     <- data.frame(dv = rtruncnorm(625, a = 0, b = 4, mean = 2.1, sd = 1), group = "grp1_treat")
aud_notr   <- data.frame(dv = rtruncnorm(625, a = 0, b = 4, mean = 2,   sd = 1), group = "grp1_control")
noaud_tr   <- data.frame(dv = rtruncnorm(625, a = 0, b = 4, mean = 2.4, sd = 1), group = "grp2_treat")
noaud_notr <- data.frame(dv = rtruncnorm(625, a = 0, b = 4, mean = 2.1, sd = 1), group = "grp2_control")
df <- bind_rows(aud_tr, aud_notr, noaud_tr, noaud_notr)
unique(df$group)
#> [1] "grp1_treat"   "grp1_control" "grp2_treat"   "grp2_control"

I know how to run t.test for the difference in means within each group, but how do I test whether that difference itself differs across the two groups?

t.test(df$dv[df$group=="grp1_treat"],df$dv[df$group=="grp1_control"])
t.test(df$dv[df$group=="grp2_treat"],df$dv[df$group=="grp2_control"])

Solution

  • It sounds like you need a two-way analysis of variance (ANOVA). Firstly, you should ensure that you separate out "group membership" and "treatment versus control" into two columns, since these are really two distinct variables:

    df$treatment <- ifelse(grepl('treat', df$group), 'treat', 'control')
    df$group     <- ifelse(grepl('1', df$group), 'grp1', 'grp2')
    

    Then you can carry out a two-way ANOVA using aov:

    summary(aov(dv ~ group + treatment, data = df))
    #>              Df Sum Sq Mean Sq F value   Pr(>F)    
    #> group         1   1.18   1.175   1.362    0.245    
    #> treatment     1  26.14  26.145  30.307 1.14e-07 ***
    #> Residuals   197 169.95   0.863                     
    #> ---
    #> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    

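    This tells you that, in this sample, the effect of treatment was significant, but the effect of group membership was not. Note that this additive model (group + treatment) tests the two main effects only; whether the treatment effect differs between the groups is a question about the group:treatment interaction, which this model does not include.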


    Data

    We don't have your actual data, but the following sample data frame has the same names and structure as your own:

    set.seed(1)
    
    df <- data.frame(dv = c(rnorm(50, 3.2), rnorm(50, 3.8), 
                            rnorm(50, 3.5), rnorm(50, 4.1)),
                     group = rep(c('grp1_control', 'grp1_treat', 
                                   'grp2_control', 'grp2_treat'), each = 50))
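
To test the double difference from the question directly, one option is to add the group:treatment interaction to the model. In a 2x2 design with R's default treatment contrasts, the interaction coefficient equals the difference in differences of the cell means, and its row in the coefficient table gives a standard error, t value and p-value. The following is a minimal sketch, assuming the sample data frame above (rebuilt here so the block runs on its own). The names fit, interaction_row, cell_means and dd_grp2_minus_grp1 are illustrative and not part of the original answer.

```r
set.seed(1)

# Sample data with the same structure as the data frame in the answer
df <- data.frame(dv = c(rnorm(50, 3.2), rnorm(50, 3.8),
                        rnorm(50, 3.5), rnorm(50, 4.1)),
                 group = rep(c('grp1_control', 'grp1_treat',
                               'grp2_control', 'grp2_treat'), each = 50))

# Split the combined label into two factors with explicit reference levels
df$treatment <- factor(ifelse(grepl('treat', df$group), 'treat', 'control'),
                       levels = c('control', 'treat'))
df$group     <- factor(ifelse(grepl('1', df$group), 'grp1', 'grp2'),
                       levels = c('grp1', 'grp2'))

# group * treatment expands to group + treatment + group:treatment
fit <- lm(dv ~ group * treatment, data = df)

# Interaction row: estimate, standard error, t value and p-value
interaction_row <- summary(fit)$coefficients['groupgrp2:treatmenttreat', ]
print(interaction_row)

# Cross-check against the double difference of the four cell means
cell_means <- tapply(df$dv, list(df$group, df$treatment), mean)
dd_grp2_minus_grp1 <-
  (cell_means['grp2', 'treat'] - cell_means['grp2', 'control']) -
  (cell_means['grp1', 'treat'] - cell_means['grp1', 'control'])
print(dd_grp2_minus_grp1)
```

With grp1 and control as the reference levels, the estimate is the treatment effect in grp2 minus the treatment effect in grp1, which is the negative of dd as written in the question. The p-value is unaffected by the sign. This test assumes a common residual variance across the four cells. summary(aov(dv ~ group * treatment, data = df)) reports the same interaction as an F test.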