I have the following CSV file:
group;response
stepOne;107
stepOne;946
stepTwo;184
stepTwo;456
...
I read it into a data frame, calculate the statistics, and show them on a graph as horizontal error bars (geom_errorbarh). The code is as follows:
# Load required libraries
library(ggplot2)
library(dplyr)
library(viridis)
# Read the input CSV file
data <- read.csv("~/Documents/r/input_file.csv", sep=";", header=TRUE)
x_axis_name <- names(data)[2]
y_axis_name <- names(data)[1]
title <- sprintf("Confidence interval for %s", names(data)[1])
# Calculate mean and confidence intervals for each group
# Two-sided intervals: a 90% CI uses the 0.95 quantile of the t distribution, and so on.
# The standard error of the mean is sd / sqrt(n); sd() already applies the n - 1 correction.
group_stats <- data %>%
  group_by(group) %>%
  summarise(
    mean = mean(response),
    lower_ci_90 = mean - qt(0.95, df = n() - 1) * (sd(response) / sqrt(n())),
    upper_ci_90 = mean + qt(0.95, df = n() - 1) * (sd(response) / sqrt(n())),
    lower_ci_95 = mean - qt(0.975, df = n() - 1) * (sd(response) / sqrt(n())),
    upper_ci_95 = mean + qt(0.975, df = n() - 1) * (sd(response) / sqrt(n())),
    lower_ci_99 = mean - qt(0.995, df = n() - 1) * (sd(response) / sqrt(n())),
    upper_ci_99 = mean + qt(0.995, df = n() - 1) * (sd(response) / sqrt(n())),
    lower_ci_999 = mean - qt(0.9995, df = n() - 1) * (sd(response) / sqrt(n())),
    upper_ci_999 = mean + qt(0.9995, df = n() - 1) * (sd(response) / sqrt(n()))
  ) %>%
  arrange(desc(group))
# Find absolute maximum from group_stats
limit_x_left <- 0
limit_x_right <- 1.15*max(unlist(group_stats[, sapply(group_stats, is.numeric)]))
# Assumption: group_colors is not defined in the original post; here it is one viridis color per group
group_colors <- viridis(nrow(group_stats))
# Generate the graph
graph <- ggplot(group_stats, aes(y = group[order(group, decreasing=TRUE)], x = mean, color = group)) +
  geom_errorbarh(aes(xmin = lower_ci_90, xmax = upper_ci_90), height = 0.0, linewidth = 2.5, alpha = 0.25) +
  geom_errorbarh(aes(xmin = lower_ci_95, xmax = upper_ci_95), height = 0.0, linewidth = 2.5, alpha = 0.25) +
  geom_errorbarh(aes(xmin = lower_ci_99, xmax = upper_ci_99), height = 0.0, linewidth = 2.5, alpha = 0.25) +
  geom_point(size = 3) +
  labs(x = x_axis_name, y = y_axis_name, title = title, color = "Legend") +
  xlim(limit_x_left, limit_x_right) +
  scale_color_manual(values = group_colors)
# Display the graph
print(graph)
No matter how I initially arrange the group_stats data frame, ggplot places the bars in descending alphabetical order.
Trying to reorder them inside ggplot has no effect: both y = group[order(group, decreasing=FALSE)] and y = group[order(group, decreasing=TRUE)] are ignored.
However, if I use y = group[order(mean, decreasing=TRUE)], the values are sorted by mean perfectly.
So, is there any way to change the order in which the error bars are placed on the plot? I want them in ascending alphabetical order.
Per @jared_mamrot, using y = fct_rev(group) instead of y = group[order(group, decreasing=TRUE)] solves the issue. This requires installing forcats:
install.packages("forcats")
(or install.packages("tidyverse"))
and add
library(forcats)
to the code.
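For anyone landing here later, this is a minimal sketch of why fct_rev() works. As far as I understand it, ggplot2 orders a discrete axis by factor levels (alphabetical by default) and draws the first level at the bottom of the y axis, so reversing the levels puts the alphabetically first group at the top. The demo_stats data frame below is a hypothetical stand-in for group_stats; its means are computed from only the four rows visible in the CSV excerpt.

```r
library(ggplot2)
library(forcats)

# Hypothetical stand-in for group_stats (illustrative values only)
demo_stats <- data.frame(
  group = factor(c("stepOne", "stepTwo")),
  mean  = c(526.5, 320)
)

# Default factor levels are alphabetical: "stepOne", "stepTwo".
# fct_rev() reverses them, so "stepOne" becomes the last level
# and is therefore drawn at the top of the y axis.
rev_levels <- levels(fct_rev(demo_stats$group))

demo_plot <- ggplot(demo_stats, aes(y = fct_rev(group), x = mean, color = group)) +
  geom_point(size = 3) +
  labs(y = "group")

print(demo_plot)
```

The same aes(y = fct_rev(group), ...) mapping can be dropped into the full plot above in place of the group[order(...)] expression.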
Thanks @jared_mamrot