rpurrrstandard-error

Using purrr::map over lmtest::coeftest with the cluster argument calling a variable with the "~" symbol


I am failing to iterate using map (purr package) over the coeftest function (lmtest package) when I specify the argument “cluster.” This argument uses the “~” symbol to call a variable to estimate clustered standard errors. The error I am getting is:

Error in eval(model$call$data, envir) : object '.' not found.

The error seems to be that map is failing to "find" the variable in the map object supplied. Would you mind please briefly explain how to correctly call the variable using the ~ symbol within a purrr::map iteration?

Minimum reproducible example. Here I show how everything works fine without the iteration and how it fails with the map iteration.

data("mtcars")
library(tidyverse)
library(purrr)
library(lmtest)
library(sandwich)

form <- formula(vs ~ hp + cyl)

m1 <- glm(
  formula = form,
  family=binomial(link="logit"),
  data = mtcars)

# all works well here:

coeftest(m1,
         vcov = vcovCL,
         type = "HC0",
         cluster = ~cyl)

# z test of coefficients:
#   
#                 Estimate  Std.Error z value  Pr(>|z|)    
#   (Intercept)  9.0890128  1.7544879  5.1804 2.214e-07 ***
#   hp          -0.0413522  0.0081069 -5.1009 3.381e-07 ***
#   cyl         -0.6895012  0.1632085 -4.2247 2.393e-05 ***
#   ---
#   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# failed iteration with purrr::map

data <- mtcars %>% 
  split(.$am)

m2 <- map(
  data,
  ~ glm(
    formula = form,
    family=binomial(link="logit"),
    data = .))

map(m2,
    ~ coeftest(.,
               vcov = vcovCL,
               type = "HC0",
               cluster = ~cyl))

Error in eval(model$call$data, envir) : object '.' not found


Solution

  • If you are looking for ways to make this work, I might suggest using by() from base R. In the by() function, you can subset the data, build a list of arguments, including data, formula and family, and then pass those with do.call() to glm. Then, you can create the coefficient test you want and return the model and test. Here's an example:

    data("mtcars")
    library(lmtest)
    library(sandwich)
    
    form <- formula(vs ~ hp + cyl)
    
    out <- by(mtcars, list(mtcars$am), \(x){
      args <- list(formula = form, 
           family = binomial(link="logit"), 
           data = x)
      m <- do.call(glm, args)
      tst <- coeftest(m, vcov=vcovCL, type="HC0", cluster=~cyl)
      list(model = m, test = tst)
    })
    
    lapply(out, "[", "test")
    #> $`0`
    #> $`0`$test
    #> 
    #> z test of coefficients:
    #> 
    #>                Estimate  Std. Error    z value  Pr(>|z|)    
    #> (Intercept)  1.7018e+02  6.1237e+00     27.791 < 2.2e-16 ***
    #> hp          -1.6003e-02  2.0194e-07 -79247.830 < 2.2e-16 ***
    #> cyl         -2.4043e+01  8.6601e-01    -27.762 < 2.2e-16 ***
    #> ---
    #> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    #> 
    #> 
    #> 
    #> $`1`
    #> $`1`$test
    #> 
    #> z test of coefficients:
    #> 
    #>                Estimate  Std. Error     z value  Pr(>|z|)    
    #> (Intercept)  4.4251e+01  2.4495e+00  1.8066e+01 < 2.2e-16 ***
    #> hp          -2.3612e-02  6.1130e-16 -3.8625e+13 < 2.2e-16 ***
    #> cyl         -1.0070e+01  6.1237e-01 -1.6444e+01 < 2.2e-16 ***
    #> ---
    #> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    

    Created on 2023-09-26 with reprex v2.0.2