rplotderivativecoefficientscoefplot

Plot combining regression coefficients (partial derivatives) with CIs in R, lincom + coefplot or plotbeta?


Most of the time we run a regression with interactive terms, we are interested in a partial derivative. For example, consider the model below,

Model specification

If I am interested to know the effect of X1 on P(Y), or the partial derivative of X1 on P(Y), I need the following combination of coefficients:

Partial derivative of X1

Instead of calculating it by hand, I can use, for example, the lincom function in R to calculate linear combination of regression parameters. But I would like not only to know the numbers from calculations like this; I would like to plot them. The problem is, if I am using a R package to plot coefficients (e.g., coefplot) it plots the coefficients from my model, but with no option for linear combination of coefficients. Is there any way to combine the lincom function (or other function that calculates combination of parameter) with coefplot (or other coefficient plot packages with this option)?

Of course, in the example above I only consider the derivative of X1, and if I plot it I will have a plot with one dot and its confidence intervals only, but I would like to show in the plot the coefficients for the partial derivatives of X1, X2, and Z, as in the example below.

Coefficients plot (the one I have): Coefficients Plot

Combination of parameters or partial derivatives plot (the one I am trying to get): Combined Estimates

I discovered that Stata has a function that does what I am looking for, called "plotbeta." Does R have something similar?


Solution

  • Here's a start. This defined a function called plotBeta(), the ... are arguments that get passed down to geom_text() for the estimate text.

    plotBeta <- function(mod, confidence_level = .95, include_est=TRUE, which.terms=NULL, plot=TRUE, ...){
      require(glue)
      require(ggplot2)
      b <- coef(mod)
      mains <- grep("^[^:]*$", names(b), value=TRUE)
      mains.ind <- grep("^[^:]*$", names(b))
      if(!is.null(which.terms)){
        if(!(all(which.terms %in% mains)))stop("Not all terms in which.terms are in the model\n")
        ins <- match(which.terms, mains)
        mains <- mains[ins]
        mains.ind <- mains.ind[ins]
      }
      icept <- grep("Intercept", mains)
      if(length(icept) > 0){
        mains <- mains[-icept]
        mains.ind <- mains.ind[-icept]
      }
      if(inherits(mod, "lm") & !inherits(mod, "glm")){
        crit <- qt(1-(1-confidence_level)/2, mod$df.residual)
      }else{
        crit <- qnorm(1-(1-confidence_level)/2)
      }
      out.df <- NULL
      for(i in 1:length(mains)){
        others <- grep(glue("^{mains[i]}:"), names(b))
        others <- c(others, grep(glue(":{mains[i]}:"), names(b)))
        others <- c(others, grep(glue(":{mains[i]}$"), names(b)))
        all.inds <- c(mains.ind[i], others)
        ones <- rep(1, length(all.inds))
        est <- c(b[all.inds] %*% ones)
        se.est <- sqrt(c(ones %*% vcov(mod)[all.inds, all.inds] %*% ones))
        lower <- est - crit*se.est
        upper <- est + crit*se.est
        tmp <- data.frame(var = mains[i],  
                          lab = glue("dy/d{mains[i]} = {paste('B', all.inds, sep='', collapse=' + ')}"), 
                          labfac = i, 
                          est = est, 
                          se.est = se.est, 
                          lower = lower, 
                          upper=upper)
        tmp$est_text <- sprintf("%.2f (%.2f, %.2f)", tmp$est, tmp$lower, tmp$upper)
        out.df <- rbind(out.df, tmp)
      }
      out.df$labfac <- factor(out.df$labfac, labels=out.df$lab)
      if(!plot){
        return(out.df)
      }else{
        g <- ggplot(out.df, aes(x=est, y=labfac, xmin=lower, xmax=upper)) + 
          geom_vline(xintercept=0, lty=2, size=.25, col="gray50") + 
          geom_errorbarh(height=0) + 
          geom_point() + 
          ylab("") + xlab("Estimates Combined") + 
          theme_classic() 
        if(include_est){
          g <- g + geom_text(aes(label=est_text), vjust=0, ...)
        }
        g
      }
    }
    

    Here's an example with some made-up data:

    set.seed(2101)
    dat <- data.frame(
      X1 = rnorm(500), 
      X2 = rnorm(500), 
      Z = rnorm(500), 
      W = rnorm(500)
    )
    dat <- dat %>% 
      mutate(yhat = X1 - X2 + X1*X2 - X1*Z + .5*X2*Z - .75*X1*X2*Z + W, 
             y = yhat + rnorm(500, 0, 1.5))
    
    mod <- lm(y ~ X1*X2*Z + W, data=dat)
    plotBeta(mod, position=position_nudge(y=.1), size=3) + xlim(-2.5,2)
    

    enter image description here

    EDIT: comparing two models

    Using the newly-added plot=FALSE, we can generate the data and then combine and plot.

    mod <- lm(y ~ X1*X2*Z + W, data=dat)
    p1 <- plotBeta(mod, plot=FALSE)
    mod2 <- lm(y ~ X1*X2 + Z + W, data=dat)
    p2 <- plotBeta(mod2, plot=FALSE)
    p1 <- p1 %>% mutate(model = factor(1, levels=1:2, 
                                       labels=c("Model 1", "Model 2")))
    p2 <- p2 %>% mutate(model = factor(2, levels=1:2, 
                                       labels=c("Model 1", "Model 2")))
    
    p_both <- bind_rows(p1, p2)
    p_both <- p_both %>% 
      arrange(var, model) %>% 
      mutate(labfac = factor(1:n(), labels=paste("dy/d", var, sep="")))
    
    ggplot(p_both, aes(x=est, y=labfac, xmin=lower, xmax=upper)) + 
      geom_vline(xintercept=0, lty=2, size=.25, col="gray50") + 
      geom_linerange(position=position_nudge(y=c(-.1, .1))) + 
      geom_point(aes(shape=model), 
                 position=position_nudge(y=c(-.1, .1))) + 
      geom_text(aes(label=est_text), vjust=0,
                position=position_nudge(y=c(-.2, .15))) + 
      scale_shape_manual(values=c(1,16)) + 
      ylab("") + xlab("Estimates Combined") + 
      theme_classic() 
    
    

    enter image description here