r, linear-regression, data-manipulation, standard-error, robust

Fit models with robust standard errors


I am using the following R code to fit several linear regression models and collect the results in a data frame:

library(tidyverse)
library(broom)

data <- mtcars
outcomes <- c("wt", "mpg", "hp", "disp")
exposures <- c("gear", "vs", "am")

models <- expand.grid(outcomes, exposures) %>%
group_by(Var1) %>% rowwise() %>%
summarise(frm = paste0(Var1, "~factor(", Var2, ")")) %>%
group_by(model_id = row_number(),frm) %>%
do(tidy(lm(.$frm, data = data))) %>%
mutate(lci = estimate-(1.96*std.error),
     uci = estimate+(1.96*std.error))

How can I modify my code to report robust standard errors, as Stata's robust option does?

* example of using robust standard errors in STATA
regress y x, robust

Solution

  • There is a comprehensive discussion of robust standard errors for lm models on Stack Exchange.

    You can update your code in the following way:

    library(sandwich)
    
    models <- expand.grid(outcomes, exposures) %>%
     group_by(Var1) %>% rowwise() %>%
     summarise(frm = paste0(Var1, "~factor(", Var2, ")")) %>%
     group_by(model_id = row_number(),frm) %>%
     do(cbind(
      tidy(lm(.$frm, data = data)),
      robSE = sqrt(diag(vcovHC(lm(.$frm, data = data), type="HC1"))) )
     ) %>%
     mutate(
      lci  = estimate - (1.96 * std.error), 
      uci  = estimate + (1.96 * std.error),
      lciR = estimate - (1.96 * robSE),
      uciR = estimate + (1.96 * robSE)
     )
    

    The important line is this:

    sqrt(diag(vcovHC(lm(.$frm, data = data), type="HC1")))
    

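    The function vcovHC returns the heteroskedasticity-consistent covariance matrix of the coefficient estimates. Extract the variances from its diagonal with diag and take their square roots with sqrt to get the robust standard errors. Setting type="HC1" applies the same small-sample correction as Stata's robust option.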
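
To make the three steps (covariance matrix, diagonal, square root) concrete, here is a minimal sketch for a single model from the question. It is written in base R only and computes the HC1 covariance matrix by hand from the standard sandwich formula instead of calling vcovHC. The names fit, bread, meat, vc_hc1 and rob_se are illustrative, not part of any package. The result should agree with vcovHC(fit, type="HC1") up to floating-point error.

```r
# Fit one of the models from the question.
fit <- lm(mpg ~ factor(gear), data = mtcars)

X <- model.matrix(fit)   # design matrix (n rows, k columns)
e <- residuals(fit)      # OLS residuals
n <- nrow(X)
k <- ncol(X)

# Sandwich estimator: (X'X)^-1  X' diag(e^2) X  (X'X)^-1
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)   # row i of X is scaled by e[i], so this is X' diag(e^2) X

# HC1 multiplies the plain sandwich (HC0) by n / (n - k).
vc_hc1 <- (n / (n - k)) * bread %*% meat %*% bread

# Variances are on the diagonal; their square roots are the robust SEs.
rob_se <- sqrt(diag(vc_hc1))

# Classical and robust standard errors side by side.
cbind(estimate  = coef(fit),
      std.error = sqrt(diag(vcov(fit))),
      robSE     = rob_se)
```

In the pipeline above, sqrt(diag(vcovHC(...))) performs the same calculation for every formula, so the manual version is mainly useful for checking what the robSE column contains.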