I would like to test if the coefficients across different regressions are the same in R and I am not sure how I should do it.
Let me illustrate. I have an independent variable of interest called treat
I want to see if the treatment has the same effect on two different outcomes y1 and y2.
How can I do it?
Assume I have a dataset that looks like this.
# Set seed for reproducibility
set.seed(123)
# Create example dataset
n <- 500 # Number of observations
groups <- 20 # Number of groups
years <- 10 # Number of years
# Simulate group and year identifiers
group <- rep(1:groups, each = n / groups)
yr <- rep(1:years, length.out = n)
# Simulate treatment assignment
treat <- sample(c(0, 1), size = n, replace = TRUE)
# Create covariates and treatment effects
X <- rnorm(n) # Random covariate
epsilon1 <- rnorm(n, sd = 2) # Error term for Y1
epsilon2 <- rnorm(n, sd = 2) # Error term for Y2
# Generate dependent variables
Y1 <- 5 + 2 * treat + 0.5 * X + epsilon1 # Y1: treatment effect = 2
Y2 <- 5 + 1 * treat + 0.5 * X + epsilon2 # Y2: treatment effect = 1
# Combine into a data frame
dt <- data.frame(group, yr, treat, X, Y1, Y2)
My objective is to evaluate if treat
has a significantly different effect on Y1
relative to Y2
.
I hence estimate my two models. In this case I am using two way fixed effects.
# Estimate the model for Y1
twfe_Y1 <- feols(Y1 ~ i(treat, ref = FALSE) | yr + group, data = dt, vcov = ~group)
# Estimate the model for Y2
twfe_Y2 <- feols(Y2 ~ i(treat, ref = FALSE) | yr + group, data = dt, vcov = ~group)
I can then compare models using etable
etable(twfe_Y1,twfe_Y2)
twfe_Y1 twfe_Y2
Dependent Var.: Y1 Y2
treat = 1 2.079*** (0.2368) 1.062*** (0.1361)
Fixed-Effects: ----------------- -----------------
yr Yes Yes
group Yes Yes
_______________ _________________ _________________
S.E.: Clustered by: group by: group
Observations 500 500
R2 0.22430 0.11058
Within R2 0.19325 0.05948
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How can I test if the coefficient of treat on Y1 is significantly different from the coefficient on Y2?
Note that I cannot stack the Ys and estimate a model with the interaction between treatment and Y group, because in the real analysis I am using some of the new Difference in differences packages for the estimation process and it is not entirely clear how they can include interaction with the treatment.
So I need a method that ex post takes coefficients & standard errors and test for equality of coefficient.
Any guidance on how to implement this would be great.
As Vincent suggested, github.com/kylebutts/vcovSUR should give you exactly what you need. Here is how to do it with your example
# install.packages("devtools")
devtools::install_github("kylebutts/vcovSUR")
library(fixest)
library(vcovSUR)
sur_hypotheses(
list(twfe_Y1, twfe_Y2), cluster = "rowid",
hypothesis = "`treat::1_1` = `treat::1_2`" # mind the quotes!
)
Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
1.02 0.261 3.9 <0.001 13.3 0.506 1.53
I was also reading a very similar topic here https://stats.stackexchange.com/questions/93540/testing-equality-of-coefficients-from-two-different-regressions, and it was suggested to use a Z statistic, which I think can be calculated as follows
beta1 <- as.numeric(twfe_Y1$coefficients)
se1<- as.numeric(twfe_Y1$se)
beta2 <- as.numeric(twfe_Y2$coefficients)
se2<- as.numeric(twfe_Y2$se)
Z <- (beta1 - beta2)/ sqrt(se1^2 + se2^2)
I did not dive deep into it, but I noted that, interestingly, these two methods return very similar (but not identical) p-values
pnorm(Z, lower.tail = F)
9.784923e-05
sur_hypotheses(
list(twfe_Y1, twfe_Y2), cluster = "rowid",
hypothesis = "`treat::1_1` = `treat::1_2`"
)$p.value
9.743827e-05