I have a bar chart and I want to order the bars, calculate the percentage change from one bar to the next, and show it on the plot.
I used factors and tried different methods, including everything I could find on Stack Overflow, but I get all sorts of errors. I managed to order the legend by colour, but the bar colours are not in the same order.
Here is my code:
library(ggplot2)
data$Active_period <- factor(data$Active_period,levels = c("study_year","last_year", "this_year"))
fct_rev(data$Active_period)
ggplot(data, aes(count, Name, fill = Active_period)) +
  geom_col(width = 0.8, position = position_dodge2(width = 0.8, preserve = "single")) +
  facet_grid(Time ~ site) +
  scale_fill_manual(name = " ", values = c("study_year" = "#5d5c5c", "this_year" = "#4475aa",
                                           "last_year" = "#333239"),
                    limits = c("study_year", "last_year", "this_year")) +
  theme(legend.position = "bottom",
        axis.text = element_text(size = 11),
        axis.title = element_text(size = 11, face = "bold"),
        strip.text.x = element_text(size = 12),
        strip.text.y = element_text(size = 12),
        plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(size = 13))
Here is my data:
Name | Active_period| site |Time | count
A | Last_year| north| mornings |10
A | Last_year| south| mornings |20
A | Last_year| north| evenings|45
A | Last_year| south| evenings|35
A | this_year| north| mornings |80
A | this_year| south| mornings |60
A | this_year| north| evenings|95
A | this_year| south| evenings|120
A | study_year| north| mornings |100
A | study_year| south| mornings |400
A | study_year| north| evenings|220
A | study_year| south| evenings|32
B | Last_year| north| mornings |10
B | Last_year| south| mornings |45
B | Last_year| north| evenings|25
B | Last_year| south| evenings|20
B | this_year| north| mornings |300
B | this_year| south| mornings |250
B | this_year| north| evenings|140
B | this_year| south| evenings|20
B | study_year| north| mornings |10
B | study_year| south| mornings |20
B | study_year| north| evenings|10
B | study_year| south| evenings|20
Here is a solution.
First, I create a custom theme so that the problem code is simpler. This is not really part of the question and it can be defined before the answer.
theme_so_q77804382 <- function(){
  theme_grey(base_size = 10) %+replace%
    theme(
      legend.position = "bottom",
      axis.text = element_text(size = 11),
      axis.title = element_text(size = 11, face = "bold"),
      strip.text.x = element_text(size = 12),
      strip.text.y = element_text(size = 12, angle = -90),
      plot.title = element_text(size = 14, face = "bold"),
      plot.subtitle = element_text(size = 13)
    )
}
The plot is a horizontal bar plot of counts by name. Instead of putting the names on the y axis, a common approach is to map the independent variable to the x axis as usual and then flip the axes with coord_flip.
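As a minimal sketch of that idea, the snippet below builds a horizontal bar plot from a tiny made-up data frame (the `toy` object is an assumption for illustration only, not the question's data). The names go on the x axis and coord_flip swaps the axes afterwards.

```r
library(ggplot2)

# Hypothetical toy data, only to illustrate the axis flip
toy <- data.frame(Name = c("A", "B"), count = c(10, 45))

p <- ggplot(toy, aes(Name, count)) +  # Name on x, count on y, as usual
  geom_col() +
  coord_flip()                        # swap the axes: the bars become horizontal
```

Printing `p` draws the plot; everything added before coord_flip is still written in terms of the unflipped x and y.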
Now the problem. The steps, as implemented in the code below, are:
1. coerce Active_period to factor;
2. group by Name, site and Time;
3. compute the percentage change between consecutive periods, starting from study_year;
4. reverse the factor levels to have the plot as asked.
Then just plot it, with the percentages plotted with geom_text. The text must also be grouped and since fill is not a geom_text aesthetic, use group = Active_period. Play a little with hjust in order to have the minus signs visible whenever the changes are negative.
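The factor steps can be checked on their own with a short base-R sketch. It assumes Active_period is a character column and uses a small hypothetical vector `period` with the same labels as the question's table, including the capitalised "Last_year". The last line shows one base-R way to get the same level order that forcats::fct_rev produces.

```r
# Period labels as they appear in the question's table (note the capital "L")
period <- c("Last_year", "this_year", "study_year")
lv <- c("study_year", "last_year", "this_year")

# A value that is missing from `levels` silently becomes NA
is.na(factor(period, levels = lv))   # TRUE FALSE FALSE

# Recode first, then build the factor
period[period == "Last_year"] <- "last_year"
f <- factor(period, levels = lv)

# Reverse the level order without forcats
f_rev <- factor(f, levels = rev(levels(f)))
levels(f_rev)   # "this_year" "last_year" "study_year"
```

This is why the code recodes "Last_year" before calling factor: otherwise those rows would be dropped from the ordering as NA.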
suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})

# make the capitalised label match the factor levels
data$Active_period[data$Active_period == "Last_year"] <- "last_year"

data %>%
  mutate(Active_period = factor(Active_period, levels = c("study_year", "last_year", "this_year"))) %>%
  group_by(Name, site, Time) %>%
  arrange(Active_period) %>%
  # change relative to the previous period; the first period gets 0%
  mutate(perc = scales::percent(c(0, diff(count))/dplyr::lag(count, default = 1))) %>%
  ungroup() %>%
  # reverse the levels so the bars are drawn in the requested order
  mutate(Active_period = forcats::fct_rev(Active_period)) %>%
  ggplot(aes(Name, count, fill = Active_period)) +
  geom_col(
    width = 0.8,
    position = position_dodge2(width = 0.8, preserve = "single")
  ) +
  geom_text(
    position = position_dodge2(width = 0.8, preserve = "single"),
    aes(label = perc, group = Active_period),
    hjust = -0.2
  ) +
  scale_fill_manual(
    name = " ",
    values = c(study_year = "#5d5c5c", this_year = "#4475aa", last_year = "#333239"),
    limits = c("study_year", "last_year", "this_year")
  ) +
  coord_flip() +
  facet_grid(Time ~ site) +
  theme_so_q77804382()
Created on 2024-01-12 with reprex v2.0.2
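To see what the percentage step computes, here is a worked example in base R for a single group (Name A, site north, mornings), with the counts taken from the question's table and ordered study_year, last_year, this_year. The vector `prev` is a hand-written stand-in for dplyr::lag(count, default = 1), and paste0 with round is used in place of scales::percent, so treat it as a sketch of the arithmetic rather than the exact pipeline.

```r
# Counts for Name A, north, mornings: study_year, last_year, this_year
count <- c(100, 10, 80)

# Previous value of each element; the first has none, so use 1
# (this mirrors dplyr::lag(count, default = 1))
prev <- c(1, head(count, -1))

# Relative change from the previous period; the first entry is 0 / 1 = 0
change <- c(0, diff(count)) / prev

# Format as percentage labels
perc <- paste0(round(100 * change), "%")
perc   # "0%" "-90%" "700%"
```

The default of 1 only serves to avoid dividing by a missing value in the first position, where the numerator is 0 anyway.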
In order to remove the "0%" from the first bars, change the dplyr pipe to
data %>%
  mutate(Active_period = factor(Active_period, levels = c("study_year", "last_year", "this_year"))) %>%
  group_by(Name, site, Time) %>%
  arrange(Active_period) %>%
  mutate(
    perc = scales::percent(c(0, diff(count))/dplyr::lag(count, default = 1)),
    perc = c("", perc[-1L])
  ) %>%
  ungroup() %>%
  mutate(Active_period = forcats::fct_rev(Active_period)) %>%
The change that removes the number is in the 2nd mutate.
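The effect of that extra line can be seen on a single group's labels. The vector below is an illustrative example (the labels for Name A, north, mornings); within each group, the first element is replaced by an empty string and the rest are kept.

```r
# Labels for one group, in the order study_year, last_year, this_year
perc <- c("0%", "-90%", "700%")

# Drop the first label and put an empty string in its place
labels <- c("", perc[-1L])
labels   # "" "-90%" "700%"
```

Because the mutate runs inside group_by(Name, site, Time), this happens once per group, so only the study_year bars lose their label.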