I'm trying to create a histogram that uses one column of the data set for the fill colour and another column for the groups. Both are defined within aes(). I then add a white border around each group. There are more groups than fill categories.
My problem is that when I define groups, the sub-groups within the same fill category are not stacked together in the bars; they appear in a seemingly random order. I tried ordering the data.frame by the fill column before passing it to ggplot(), but this doesn't help.
How can I display observations with the same fill category together even when they are in different sub-groups and from different dates (my x axis is dates)?
Here is some example data:
# Set seed for reproducibility:
set.seed(123)
# Set start date:
start_date <- as.Date("2024-01-01")
# Set end date:
end_date <- as.Date("2024-04-01")
# Create data.frame:
data <- data.frame(
  onset_date = sample(seq(start_date, end_date, by = "day"),
                      100,
                      replace = TRUE),
  category = sample(c("A", "B"),
                    100,
                    replace = TRUE))
# Add row names for grouping:
data$grouping <- as.numeric(row.names(data))
# Create epicurve_breaks
epicurve_breaks <- seq.Date(
  from = start_date,
  to = end_date,
  by = "week")
Here is the histogram without groups:
library(ggplot2)

p1 <- ggplot(data,
             aes(x = onset_date, fill = category)) +
  geom_histogram(breaks = epicurve_breaks,
                 closed = 'left',
                 colour = "white")
This gives the following plot. As you can see, the entities from the same fill category are stacked together:
Here is the code for the plot when I add groups:
p2 <- ggplot(data,
             aes(x = onset_date, fill = category, group = grouping)) +
  geom_histogram(breaks = epicurve_breaks,
                 closed = 'left',
                 colour = "white")
Here is the plot with groupings - now the category A and B squares are no longer clustered together on the bar:
Any advice on how to keep the categories grouped together, even when there are sub-groups within categories, would be much appreciated.
The stacking order follows the factor levels of the grouping variable. Using forcats, we can, for example, reorder those levels by category:
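library(ggplot2)
library(forcats)

# Reorder the grouping levels by category so groups sharing a fill stack together
ggplot(
  data,
  aes(onset_date, fill = category, group = fct_reorder(factor(grouping), category))
) +
  geom_histogram(
    breaks = epicurve_breaks,
    closed = 'left',
    colour = "white"
  ) +
  coord_fixed(6)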
(I'm not sure why my data looks different; I used your seed.)
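If you would rather not depend on forcats, the same idea can be sketched in base R: build the grouping factor with its levels already sorted by category. This is only a minimal illustration, assuming one row per group as in the example data; the names ordered_levels, grouping_f and level_categories are made up here and are not part of the original question.

```r
# Rebuild just the columns needed for the illustration (same seed as the question).
set.seed(123)
data <- data.frame(
  category = sample(c("A", "B"), 100, replace = TRUE)
)
data$grouping <- seq_len(nrow(data))

# Sort the group ids by category first, then by id, and use that as the level order.
ordered_levels <- data$grouping[order(data$category, data$grouping)]
data$grouping_f <- factor(data$grouping, levels = ordered_levels)

# Sanity check: walking through the levels, all "A" groups come before all "B" groups.
level_categories <- data$category[match(levels(data$grouping_f),
                                        as.character(data$grouping))]
stopifnot(!is.unsorted(level_categories))
```

With a factor built this way, mapping group = grouping_f inside aes() should stack the groups in level order, so groups with the same fill sit next to each other within each bar.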