I have a dataset which is a list of single cases, each with a date of occurrence (`Date.of.admission`).
I would like to plot cases per week (using a histogram), with the week start date as the x-axis labels (i.e., not the week number). I have looked at other examples that use `floor_date()` and other functions, but I haven't been able to get them to work for me.
I have changed the date format of `Date.of.admission`:
library(dplyr)
library(lubridate)
library(ggplot2)
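class(df1$Date.of.admission)  # "character" before conversion
# The dates are stored as day/month/year strings, so parse them with dmy(), not ymd()
df1 <- df1 %>%
  mutate(Date.of.admission = lubridate::dmy(Date.of.admission))
class(df1$Date.of.admission)  # "Date" after conversion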
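As a quick check of the parsing step, here is a minimal sketch on toy strings laid out like the sample data (day/month/year). The vector `x` is made up for illustration. The point is only that the parser has to match the order of the date parts, and that values that were already missing stay `NA`:

```r
library(lubridate)

# Hypothetical toy strings in the same d/m/Y layout as Date.of.admission
x <- c("23/08/2024", "3/09/2024", NA)

# dmy() reads day, then month, then year
parsed <- dmy(x)

class(parsed)        # "Date"
sum(is.na(parsed))   # 1: only the value that was already missing
```

If the count of `NA` values after parsing is higher than before, the parser probably does not match the layout of the strings.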
Then I set a start and end date:
start_date <- ymd("2023-12-11")
end_date <- ymd("2025-03-31")
Then I plotted the cases:
ggplot(df1) +
  geom_histogram(aes(Date.of.admission), binwidth = 2, color = "black", fill = "black") +
  labs(x = "Admission date",
       y = "Weekly cases") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  # date_labels takes the format string directly, so scales::date_format() is not needed
  scale_x_date(date_labels = "%d %b %y", date_breaks = "30 days", limits = c(start_date, end_date))
However, instead of the daily cases that this plot shows, I would like them to appear as cases per week, with the start date of each week on the x axis.
Many thanks!
Data:
structure(list(Case.no. = 1:65, case.likelihood = c(NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Date.of.admission = c("23/08/2024",
"3/09/2024", "15/09/2024", "11/09/2024", "3/01/2024", "18/01/2024",
"5/02/2024", "15/02/2024", "22/02/2024", "24/03/2024", "26/04/2024",
"18/05/2024", "20/05/2024", "27/05/2024", "18/06/2024", "11/07/2024",
"21/07/2024", "2/08/2024", "5/09/2024", "5/09/2024", "4/04/2024",
"7/02/2024", "7/12/2024", "9/12/2024", "10/08/2024", "11/03/2024",
"12/03/2024", "13/06/2024", "14/12/2023", "18/07/2024", "25/03/2024",
"27/01/2024", "27/02/2024", NA, "16/09/2024", "15/09/2024", "17/09/2024",
"23/10/2024", "16/10/2024", "16/10/2024", "27/11/2024", "7/11/2024",
"7/11/2024", "12/10/2024", "8/10/2024", "5/11/2024", "25/10/2024",
"24/10/2024", "30/11/2024", NA, "3/01/2025", "6/01/2025", "6/01/2025",
"8/01/2025", "8/01/2025", "22/12/2024", "13/01/2025", "4/01/2025",
"26/12/2024", "6/01/2025", "18/01/2025", "23/01/2025", NA, "15/03/2025",
"5/03/2025"), Zoo.ID.or.name = c("M2409108", NA, "M2409115 'Heritage'",
"M2409114 'Sloane'", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "M2409117", "M2409116", "M2409113 'Chicago'",
"20241023SM\n", NA, NA, "M2410126", "M2412133 'Chloe2'", "Chloe3'",
"Scout", "Alana", "Paul", "Holgate", "Vasse", "M2411128 \"Zingy\"",
"M2412136 ", NA, NA, NA, NA, NA, "M2412144", "Tonkin", "Georgie",
"Mandy", "Flic", NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA,
-65L))
You should use a barplot instead of a histogram for weekly counts, which requires some data manipulation first:
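# Number of weeks between the first and last date on the axis
end_week <- as.numeric(difftime(end_date, start_date, units = "weeks"))
df1 %>%
  filter(!is.na(Date.of.admission)) %>%  # drop cases with no admission date
  # floor() puts each case in the week that starts on or before its date
  mutate(week = floor(as.numeric(
    difftime(Date.of.admission, start_date, units = "weeks")))) %>%
  count(week) %>%
  ggplot() +
  geom_col(aes(week, n), fill = "black") +
  labs(x = "Admission week", y = "Weekly cases") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  # x is a week index, so start_date + x * 7 days is the start date of that week
  scale_x_continuous(breaks = seq(0, end_week, 7),
                     labels = \(x) format(start_date + x * 7, "%d %b %y"),
                     limits = c(-1, end_week))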
If you prefer more (or fewer) x-axis labels, lower (or raise) the "7" in `breaks=seq()`. That number is the spacing between labels, measured in weeks.
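Since the question mentions `floor_date()`, here is an alternative sketch that keeps real dates on the x axis instead of a week index. It assumes weeks start on Monday (`week_start = 1`) and uses a small made-up data frame, `toy`, in place of `df1`. The column names `week_start` and `cases` are also just illustrative.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Hypothetical toy data standing in for df1 (same d/m/Y string layout)
toy <- data.frame(
  Date.of.admission = c("1/01/2024", "3/01/2024", "8/01/2024", "22/01/2024", NA)
)

weekly <- toy %>%
  mutate(Date.of.admission = dmy(Date.of.admission)) %>%
  filter(!is.na(Date.of.admission)) %>%
  # Snap each date back to the Monday of its week (assumption: Monday start)
  mutate(week_start = floor_date(Date.of.admission, unit = "week", week_start = 1)) %>%
  count(week_start, name = "cases")

# Explicit breaks built from the data, so every label is a week start date
week_breaks <- seq(min(weekly$week_start), max(weekly$week_start), by = "1 week")

p <- ggplot(weekly, aes(week_start, cases)) +
  geom_col(fill = "black") +
  scale_x_date(breaks = week_breaks, date_labels = "%d %b %y") +
  labs(x = "Week starting", y = "Weekly cases") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))

p
```

With the full dataset, one label per week would be crowded. Changing `by = "1 week"` to something like `by = "4 weeks"` thins the labels while keeping each one on a week start date.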