Boxplot with Tukey Comparison Bars
I ran an ANOVA and found that at least one of my group means was statistically significantly different. As a follow-up analysis, I ran a Tukey test to find which groups differ in means, and I'm hoping to illustrate this in a boxplot like the one presented in this paper, Effects of intermittent Pringle's manoeuvre on cirrhotic compared with normal liver:
https://academic.oup.com/bjs/article/97/7/1062/6150536?login=true
I can generate the boxplot, but I want to add bars, each marked with an asterisk, that show which groups differ significantly in mean, as highlighted in the image. Does anyone know how I could approach this, ideally in SAS or R?
In SAS, I've used PROC SGPLOT to generate the boxplot, and in R, I know I can use geom_boxplot, but as for any additional annotations, I'm not sure what options are available to accomplish this.
In base R you can draw these lines manually if you want more granular control, or for a quick and largely automated approach use ggstatsplot::ggbetweenstats (among other approaches, I'm sure):
Data
df <- data.frame(DAST = 1:300,
Category = rep(c("Normal", "Chronic Hepatitis", "Liver Cirrhosis"), each = 100))
ggbetweenstats approach
See ?ggstatsplot::ggbetweenstats for a wide range of options on how to customize the plot.
library(ggplot2)
library(ggstatsplot)
ggstatsplot::ggbetweenstats(df, x = Category, y = DAST)
Base R approach
The bars are colored here only for clarity.
# vertical spacing between bars
v_spacing <- max(df$DAST) + seq(20, 50, length.out = 3)
# plotting a factor against a numeric vector draws a boxplot
plot(x = as.factor(df$Category), y = df$DAST,
     xlab = NA, ylab = "D-AST", frame = FALSE)
# horizontal lines - position 1 = Chronic Hepatitis, 2 = Liver Cirrhosis, 3 = Normal
# bars map between positions 1-2, 1-3, 2-3
# (y1 is omitted, so it defaults to y0 and each segment is horizontal)
segments(x0 = c(1, 1, 2),
         x1 = c(2, 3, 3),
         y0 = v_spacing,
         xpd = TRUE,
         col = c("red", "green", "blue"))
# vertical lines (short ticks at both ends of each bar)
segments(x0 = c(1, 2, 1, 3, 2, 3),
         y0 = rep(v_spacing, each = 2),
         y1 = rep(v_spacing, each = 2) - 5,
         xpd = TRUE,
         col = rep(c("red", "green", "blue"), each = 2))
# Denote significance
text(x = c(mean(1:2), mean(c(1, 3)), mean(2:3)),
     y = v_spacing + 5,
     labels = "*",
     xpd = TRUE,
     col = c("red", "green", "blue"))
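The bar positions above are hard-coded for all three pairs. If you would rather have them follow the Tukey results, something like the sketch below might work. It uses stats::aov and stats::TukeyHSD on the same toy data; the names alpha, sig, lv and bars are just illustrative, the 0.05 cutoff is an assumption, and splitting the row names on "-" assumes none of the group names contain a hyphen.

```r
# Same toy data as above
df <- data.frame(DAST = 1:300,
                 Category = rep(c("Normal", "Chronic Hepatitis", "Liver Cirrhosis"), each = 100))
df$Category <- factor(df$Category)

# One-way ANOVA followed by Tukey's HSD
fit <- aov(DAST ~ Category, data = df)
tuk <- TukeyHSD(fit)$Category

# Keep only pairs whose adjusted p-value is below alpha (assumed cutoff)
alpha <- 0.05
sig <- tuk[tuk[, "p adj"] < alpha, , drop = FALSE]

# Row names look like "B-A"; map each side to its x position on the boxplot
# (assumes no hyphens inside the level names)
pairs <- strsplit(rownames(sig), "-", fixed = TRUE)
lv <- levels(df$Category)
bars <- data.frame(x0 = match(vapply(pairs, `[`, character(1), 2), lv),
                   x1 = match(vapply(pairs, `[`, character(1), 1), lv))
# stack the bars above the data, 15 units apart
bars$y <- max(df$DAST) + 20 + 15 * (seq_len(nrow(bars)) - 1)

plot(x = df$Category, y = df$DAST, xlab = NA, ylab = "D-AST", frame = FALSE)
if (nrow(bars) > 0) {
  # horizontal bars
  segments(x0 = bars$x0, x1 = bars$x1, y0 = bars$y, xpd = TRUE)
  # short vertical ticks at both ends of each bar
  segments(x0 = c(bars$x0, bars$x1), y0 = rep(bars$y, 2),
           y1 = rep(bars$y, 2) - 5, xpd = TRUE)
  # asterisk centered above each bar
  text(x = (bars$x0 + bars$x1) / 2, y = bars$y + 5, labels = "*", xpd = TRUE)
}
```

With the toy data all three pairs come out significant, so this should reproduce the same three bars as the manual version, just without the colors. On real data only the pairs that pass the cutoff get a bar.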