requalitycoefficients

Compare coefficients across different regressions in R


I would like to test if the coefficients across different regressions are the same in R and I am not sure how I should do it.

Let me illustrate. I have an independent variable of interest called treat I want to see if the treatment has the same effect on two different outcomes y1 and y2. How can I do it?

Assume I have a dataset that looks like this.

# Set seed for reproducibility
set.seed(123)

# Create example dataset
n <- 500  # Number of observations
groups <- 20  # Number of groups
years <- 10  # Number of years

# Simulate group and year identifiers
group <- rep(1:groups, each = n / groups)
yr <- rep(1:years, length.out = n)

# Simulate treatment assignment
treat <- sample(c(0, 1), size = n, replace = TRUE)

# Create covariates and treatment effects
X <- rnorm(n)  # Random covariate
epsilon1 <- rnorm(n, sd = 2)  # Error term for Y1
epsilon2 <- rnorm(n, sd = 2)  # Error term for Y2

# Generate dependent variables
Y1 <- 5 + 2 * treat + 0.5 * X + epsilon1  # Y1: treatment effect = 2
Y2 <- 5 + 1 * treat + 0.5 * X + epsilon2  # Y2: treatment effect = 1

# Combine into a data frame
dt <- data.frame(group, yr, treat, X, Y1, Y2)

My objective is to evaluate if treat has a significantly different effect on Y1 relative to Y2. I hence estimate my two models. In this case I am using two way fixed effects.

# Estimate the model for Y1
twfe_Y1 <- feols(Y1 ~ i(treat, ref = FALSE) | yr + group, data = dt, vcov = ~group)

# Estimate the model for Y2
twfe_Y2 <- feols(Y2 ~ i(treat, ref = FALSE) | yr + group, data = dt, vcov = ~group)

I can then compare models using etable

etable(twfe_Y1,twfe_Y2)
                  twfe_Y1           twfe_Y2
Dependent Var.:                Y1                Y2
                                                   
treat = 1       2.079*** (0.2368) 1.062*** (0.1361)
Fixed-Effects:  ----------------- -----------------
yr                            Yes               Yes
group                         Yes               Yes
_______________ _________________ _________________
S.E.: Clustered         by: group         by: group
Observations                  500               500
R2                        0.22430           0.11058
Within R2                 0.19325           0.05948
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

How can I test if the coefficient of treat on Y1 is significantly different from the coefficient on Y2?

Note that I cannot stack the Ys and estimate a model with the interaction between treatment and Y group, because in the real analysis I am using some of the new Difference in differences packages for the estimation process and it is not entirely clear how they can include interaction with the treatment.

So I need a method that ex post takes coefficients & standard errors and test for equality of coefficient.

Any guidance on how to implement this would be great.


Solution

  • As Vincent suggested, github.com/kylebutts/vcovSUR should give you exactly what you need. Here is how to do it with your example

    # install.packages("devtools")
    devtools::install_github("kylebutts/vcovSUR")
    library(fixest)
    library(vcovSUR)
    
    sur_hypotheses(
      list(twfe_Y1, twfe_Y2), cluster = "rowid",
      hypothesis = "`treat::1_1`  = `treat::1_2`" # mind the quotes!
    )
    
     Estimate Std. Error   z Pr(>|z|)    S 2.5 % 97.5 %
         1.02      0.261 3.9   <0.001 13.3 0.506   1.53
    

    I was also reading a very similar topic here https://stats.stackexchange.com/questions/93540/testing-equality-of-coefficients-from-two-different-regressions, and it was suggested to use a Z statistic, which I think can be calculated as follows

    beta1 <- as.numeric(twfe_Y1$coefficients)
    se1<- as.numeric(twfe_Y1$se)
    beta2 <- as.numeric(twfe_Y2$coefficients)
    se2<- as.numeric(twfe_Y2$se)
    
    Z <- (beta1 - beta2)/ sqrt(se1^2 + se2^2)
    

    I did not dive deep into it, but I noted that, interestingly, these two methods return very similar (but not identical) p-values

    pnorm(Z, lower.tail = F)
    
    9.784923e-05
    
    sur_hypotheses(
         list(twfe_Y1, twfe_Y2), cluster = "rowid",
         hypothesis = "`treat::1_1`  = `treat::1_2`" 
       )$p.value
    
    9.743827e-05