I'm trying to make a meansplot with confidence intervals, but I would like the intervals to be Tukey HSD intervals after an ANOVA is computed.
I'll use the following example to explain. The data frame contains a factor, poison, with levels {1, 2, 3}:
library(magrittr)
library(ggplot2)
library(ggpubr)
library(dplyr)
library("agricolae")
PATH <- "https://raw.githubusercontent.com/guru99-edu/R-Programming/master/poisons.csv"
df <- read.csv(PATH) %>%
  select(-X) %>%
  mutate(poison = factor(poison, ordered = TRUE))
glimpse(df)
ggplot(df, aes(x = poison, y = time, fill = poison)) +
  geom_boxplot() +
  geom_jitter(shape = 15,
              color = "steelblue",
              position = position_jitter(0.21)) +
  theme_classic()
anova_one_way <- aov(time ~ poison, data = df)
summary(anova_one_way)
# Use TukeyHSD
tukeyHSD <- TukeyHSD(anova_one_way)
plot(tukeyHSD)
I would like the plot to be similar to the one from Statgraphics, where you can see each mean as a point and the length of the bars is the Tukey HSD interval, so that at a glance you can see which level is best and whether it is statistically significantly better than the others.
I have seen some examples in more complex questions, but they are for boxplots and I don't understand them well enough to adapt the solutions here:
Tukey's results on boxplot in R (example1)
TukeyHSD results on boxplot after two-way anova (example2)
The answer provided by Allan Cameron (@allan-cameron) is great; however, it doesn't work on my computer right now, probably because of package versions: the argument names of stat_summary have changed between ggplot2 versions. I took his solution and made a couple of changes to make it work for me.
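For anyone unsure which spelling their installation expects, the following is a minimal sketch, not a definitive fix. It relies on the fact that ggplot2 3.3.0 introduced `fun`, `fun.min` and `fun.max` in place of `fun.y`, `fun.ymin` and `fun.ymax`. The helper name `tukey_errorbar` and its `half_length` argument are hypothetical and are not part of ggplot2.

```r
library(ggplot2)

# fun / fun.min / fun.max replaced fun.y / fun.ymin / fun.ymax in ggplot2 3.3.0
uses_new_names <- packageVersion("ggplot2") >= "3.3.0"

# Hypothetical helper (not a ggplot2 function): builds the error-bar layer
# using whichever argument names the installed ggplot2 version expects.
tukey_errorbar <- function(half_length, ...) {
  upper <- function(x) mean(x) + half_length
  lower <- function(x) mean(x) - half_length
  args <- if (uses_new_names) {
    list(fun.min = lower, fun.max = upper)
  } else {
    list(fun.ymin = lower, fun.ymax = upper)
  }
  do.call(stat_summary, c(args, list(geom = "errorbar", ...)))
}
```

With the objects from this question it could be used as `ggplot(df, aes(poison, time)) + tukey_errorbar(tukeyCI, width = 0.25)`.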
# Allan's original response
tukeyCI <- (tukeyHSD$poison[1, 1] - tukeyHSD$poison[1, 2]) / 2
# Changed fun.max and fun.min to fun.ymax and fun.ymin
# Changed fun to fun.y to make Allan's solution work for me.
ggplot(df, aes(x = poison, y = time)) +
  stat_summary(fun.ymax = function(x) mean(x) + tukeyCI,
               fun.ymin = function(x) mean(x) - tukeyCI,
               geom = 'errorbar', size = 1, color = 'gray50',
               width = 0.25) +
  stat_summary(fun.y = mean, geom = 'point', size = 4, shape = 21,
               fill = 'white') +
  geom_point(position = position_jitter(width = 0.25), alpha = 0.4,
             color = 'deepskyblue4') +
  theme_minimal(base_size = 16)
Error response was:
I'm currently using these versions:
The image from statgraphics shows error bars around the mean points, and if I understand you correctly then you want to be able to draw error bars around your mean points such that non-overlapping error bars mean there are significant differences between the variables. That being the case, we can extract the required confidence interval like this:
tukeyCI <- (tukeyHSD$poison[1,1] - tukeyHSD$poison[1,2])/2
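To see where that number comes from, here is a small sketch on the built-in `PlantGrowth` data. The data set is an assumption chosen only because it ships with R and is balanced (10 observations per group), and the variable names are illustrative. In the matrix returned by `TukeyHSD`, column 1 is `diff` and column 2 is `lwr`, so `diff - lwr` is the half-width of the interval for a pairwise difference. In a balanced design this half-width is the same for every pair and equals the HSD; each bar then extends half of it on either side of its mean.

```r
# Illustration on built-in data (assumed stand-in for the poison data)
fit <- aov(weight ~ group, data = PlantGrowth)
tk  <- TukeyHSD(fit)$group   # columns: diff, lwr, upr, p adj

# Half-width of each pairwise interval; identical across pairs when balanced
half_width <- tk[, "diff"] - tk[, "lwr"]

# The same quantity from the studentized range distribution
k   <- nlevels(PlantGrowth$group)                    # number of groups
n   <- nrow(PlantGrowth) / k                         # observations per group
mse <- sum(residuals(fit)^2) / df.residual(fit)      # residual mean square
hsd <- qtukey(0.95, k, df.residual(fit)) * sqrt(mse / n)

# Each bar reaches hsd / 2 above and below its group mean
bar_half_length <- hsd / 2
```

If the group sizes differ, the half-width varies from pair to pair, so a single bar length no longer corresponds exactly to every comparison.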
And we can draw the result in ggplot like this:
ggplot(df, aes(x = poison, y = time)) +
  stat_summary(fun.max = function(x) mean(x) + tukeyCI,
               fun.min = function(x) mean(x) - tukeyCI,
               geom = 'errorbar', size = 1, color = 'gray50',
               width = 0.25) +
  stat_summary(fun = mean, geom = 'point', size = 4, shape = 21,
               fill = 'white') +
  geom_point(position = position_jitter(width = 0.25), alpha = 0.4,
             color = 'deepskyblue4') +
  theme_minimal(base_size = 16)
Here we can see that there are significant differences between 1 and 3, and between 2 and 3, but that the difference between 1 and 2 is non-significant.
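The visual reading can be cross-checked against the adjusted p-values. The sketch below again uses the built-in `PlantGrowth` data as an assumed stand-in (balanced, three groups), with illustrative variable names. Two bars fail to overlap exactly when the gap between the two means exceeds twice the bar half-length, which should match the pairs with an adjusted p-value below 0.05.

```r
# Illustration on built-in data (assumed stand-in for the poison data)
fit   <- aov(weight ~ group, data = PlantGrowth)
tk    <- TukeyHSD(fit)$group
half  <- unname(tk[1, "diff"] - tk[1, "lwr"]) / 2   # bar half-length
means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)

# All pairs of levels, in the same order as the rows of the TukeyHSD matrix
pairs <- combn(names(means), 2)
gap   <- unname(abs(means[pairs[2, ]] - means[pairs[1, ]]))

# Compare bar overlap with the Tukey-adjusted p-values
data.frame(pair         = paste(pairs[2, ], pairs[1, ], sep = "-"),
           bars_overlap = gap <= 2 * half,
           p_adj        = unname(tk[, "p adj"]))
```

The equivalence relies on equal group sizes; with unbalanced groups the table from `TukeyHSD` remains the authoritative result.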