r  ggplot2  r-forestplot

How to add tables to a forest plot


I have the following dataframe df:

df <- data.frame(
  beta = c(0.45, -0.12, 0.33, -0.07, 0.21, 0.65, -0.18, 0.09),
  se = c(0.05, 0.03, 0.04, 0.02, 0.06, 0.07, 0.03, 0.05),
  prs_trait1 = c(
    "Rose Growth Index",
    "Tulip Petal Width",
    "Sunflower Height",
    "Daisy Bloom Count",
    "Orchid Stem Thickness",
    "Lily Leaf Area",
    "Carnation Bud Count",
    "Violet Flower Density"
  ),
  OR_CI_text = c(
    "1.57(1.46-1.70)",
    "0.89(0.84-0.94)",
    "1.39(1.28-1.52)",
    "0.93(0.90-0.96)",
    "1.23(1.15-1.32)",
    "1.92(1.72-2.15)",
    "0.83(0.79-0.87)",
    "1.10(1.02-1.18)"
  ),
  N = c(1200, 950, 1350, 890, 1020, 1600, 800, 1150) # Number of observations
)

I wrote the code for this forest plot:

# Create the forest plot
# (column names must match df: prs_trait1 and se, not trait and SE)
ggforestplot::forestplot(
  df = df,
  name = prs_trait1,
  estimate = beta,
  se = se,
  xlab = "Odds ratio (95% CI)",
  title = NULL,
  grid = FALSE
)

Now I would like to add OR_CI_text to the right of this forest plot, and the trait and N to the left, as in the attached figure:

enter image description here

How can I adjust my R code? Thanks!


Solution

  • This solution uses base R graphics rather than ggplot2 or a forest plot package. I learned plotting in base R first, so I'm more familiar with its control over the exact placement of each element. ggplot2 also gives plenty of control, so I'm sure there is a way to do this with those packages as well.

    First, while you can transform your beta and SE values to the OR and confidence intervals, I'm just going to extract them from the text values in df$OR_CI_text and keep them in a reference matrix called plotvals:

    # extract point and lo/hi values from text
    plotvals <- do.call(rbind, 
                        lapply(regmatches(df$OR_CI_text, gregexpr("([0-9.]+)", df$OR_CI_text)), 
                               as.numeric))
    
    #      [,1] [,2] [,3]
    # [1,] 1.57 1.46 1.70
    # [2,] 0.89 0.84 0.94
    # [3,] 1.39 1.28 1.52
    # [4,] 0.93 0.90 0.96
    # [5,] 1.23 1.15 1.32
    # [6,] 1.92 1.72 2.15
    # [7,] 0.83 0.79 0.87
    # [8,] 1.10 1.02 1.18
    

    This gives us the raw values we want to plot.
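
    For completeness, the transformation route mentioned above can be sketched as follows. This assumes beta is a log odds ratio and uses a standard Wald 95% interval; both are assumptions, since the question does not say how OR_CI_text was produced, and the resulting intervals need not match those strings exactly. The name plotvals_calc is only illustrative and is kept separate from plotvals so the rest of the code is unaffected.

    ```r
    # First three rows of the question's data
    df <- data.frame(
      beta = c(0.45, -0.12, 0.33),
      se   = c(0.05, 0.03, 0.04)
    )

    # Assumption: beta is on the log-odds scale, so exp() gives the odds ratio
    z <- qnorm(0.975)  # two-sided 95% normal quantile, about 1.96

    plotvals_calc <- cbind(
      or = exp(df$beta),                # point estimate
      lo = exp(df$beta - z * df$se),    # lower Wald limit
      hi = exp(df$beta + z * df$se)     # upper Wald limit
    )

    round(plotvals_calc, 2)
    ```

    The columns are in the same order (point, low, high) as plotvals, so this matrix could be swapped in for the plotting calls if you prefer computed values over parsed text.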

    Now we can build the plot manually. If you're unfamiliar with base R plotting, I would suggest running this line-by-line to see what each command is adding to the plot as we build it:

    # Set some global parameters
    seqx <- 2^c(-3:3) # for OR axis labels
    xx <- range(0.0005, 1e2) # overall x-axis size
    yy <- seq_len(nrow(df)) # placement of graphics and text on y axis
    
    # Initiate blank plot
    plot(x = log(xx),
         y = c(0.9, nrow(df)+0.1),
         type = "n", axes = FALSE,
         xlab = NA, ylab = NA)
    
    # Add in axes and labels
    axis(1, at = log(seqx), labels = seqx)
    axis(3, at = log(seqx), labels = seqx)
    mtext(side = 1, "Odds Ratio", at = 0, padj = 4)
    mtext(side = 3, "Odds Ratio", at = 0, padj = -4)
    
    # Add in colored bars behind everything else
    rect(xleft = log(xx[1]), xright = log(xx[2]),
         ybottom = yy[c(TRUE, FALSE)]-0.5, ytop = yy[c(FALSE, TRUE)]-0.5,
         col = "lightgrey", border = NA)
    
    # Add reference line
    abline(v = log(1), lty = 3, lwd = 0.75)
    
    # Add in OR and CI points and lines
    segments(x0 = log(plotvals[,2]),
          x1 = log(plotvals[,3]),
          y0 = yy)
    points(x = log(plotvals[,1]), 
           y = yy, pch = 22, bg = "maroon")
    
    # Add in text
    text(df$prs_trait1, x = log(2^-5), y = yy, pos = 2, xpd = TRUE) # covariate
    text(df$N, x = log(2^-4), y = yy, pos = 2, xpd = TRUE) # n
    text(df$OR_CI_text, x = log(2^6), y = yy, pos = 2) # OR (CI)
    text(c("Covariate", "n", "OR (95% CI)"), # top labels
         x = log(2 ^ c(-5, -4, 6)),
         y = max(yy) + 1,
         xpd = TRUE, pos = 2)
    

    The final plot looks like this:

    enter image description here

    This is a relatively basic plot. Of course, you can play around with the different parameters to fine-tune it to exactly what you want. Good luck!
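
    One example of such a tweak: because yy counts upward from 1, the first row of df is drawn at the bottom of the plot. If you would rather have the first row at the top, reversing yy should be enough, since every drawing call positions its elements through yy. A minimal sketch with three of the trait names (the traits vector here is only for illustration):

    ```r
    traits <- c("Rose Growth Index", "Tulip Petal Width", "Sunflower Height")

    # Reversed positions: the first element gets the largest y value
    yy <- rev(seq_along(traits))

    # Blank canvas, as in the main code
    plot(x = c(0, 1),
         y = c(0.9, length(traits) + 0.1),
         type = "n", axes = FALSE,
         xlab = NA, ylab = NA)

    # "Rose Growth Index" now appears on the top line
    text(traits, x = 1, y = yy, pos = 2)
    ```

    In the full plot this means replacing seq_len(nrow(df)) with rev(seq_len(nrow(df))) when defining yy.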


    Personal note: if the lack of space between the point estimate and the parenthesized CI bugs you as much as it bugs me, you can replace the text() call that draws the OR (CI) column with:

    text(gsub("\\(", " \\(", df$OR_CI_text), x = log(2^6), y = yy, pos = 2) # OR (CI)
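
    Since the question asked about ggplot2, here is one possible sketch in that direction. It is not the ggforestplot approach from the question: it draws the intervals with plain ggplot2 and puts the text columns in the axis labels, with the trait and N on the primary (left) y axis and OR_CI_text on a duplicated (right) y axis. The column names or, lo, hi and row are introduced here for illustration only.

    ```r
    library(ggplot2)

    # First three rows of the question's data
    df <- data.frame(
      prs_trait1 = c("Rose Growth Index", "Tulip Petal Width", "Sunflower Height"),
      OR_CI_text = c("1.57(1.46-1.70)", "0.89(0.84-0.94)", "1.39(1.28-1.52)"),
      N = c(1200, 950, 1350),
      stringsAsFactors = FALSE
    )

    # Parse point estimate and limits from the text, as in the base R code
    vals <- do.call(rbind,
                    lapply(regmatches(df$OR_CI_text, gregexpr("[0-9.]+", df$OR_CI_text)),
                           as.numeric))
    df$or  <- vals[, 1]
    df$lo  <- vals[, 2]
    df$hi  <- vals[, 3]
    df$row <- rev(seq_len(nrow(df)))  # first row at the top

    p <- ggplot(df, aes(y = row)) +
      geom_vline(xintercept = 1, linetype = 3) +                 # reference line at OR = 1
      geom_segment(aes(x = lo, xend = hi, yend = row)) +         # confidence intervals
      geom_point(aes(x = or), shape = 22, fill = "maroon") +     # point estimates
      scale_x_log10(name = "Odds ratio (95% CI)") +
      scale_y_continuous(
        name = NULL,
        breaks = df$row,
        labels = paste0(df$prs_trait1, "  (n = ", df$N, ")"),    # left-hand text
        sec.axis = dup_axis(labels = df$OR_CI_text)              # right-hand text
      ) +
      theme_classic()

    p
    ```

    Axis labels cannot give you separately aligned trait and N columns; for a true table layout you would probably need to combine several panels, which this sketch does not attempt.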