I am trying to make a dodged bar chart (ggplot) using a data set containing county-level vaccination data. I want my x-axis to be the 2 counties, and I want to fill based on specific columns (one column per race). I am not sure what I should put as my y-axis. I think I need to rotate the data set and add a column with race, then use that column for the fill, but I am not sure how to do that.
I have tried to rotate the data but am not able to make the first row the column names.
The operation you are looking for is called “reshaping” or “pivoting”. The tidyr package provides pivot_longer() and pivot_wider() for that purpose. Below is an example of how to use pivot_longer() on your data and create a bar plot.
library(tidyverse)
# Creating parts of your dataset with random numbers.
# (As some of the commenters pointed out, it would be better
# if you could include your data in your question
# in a reproducible fashion.)
ms_1 <- tibble(
  COUNTY_NAME = c("Alleghany", "Montgomery", "Total"),
  COUNT_TOTAL = sample(50:500, 3),
  COUNT_ETH_NHL = sample(50:500, 3),
  COUNT_ETH_HL = sample(50:500, 3),
  COUNT_ETH_UNKNOWN = sample(50:500, 3),
  COUNT_RACE_AIAN = sample(50:500, 3),
  COUNT_RACE_ASIAN = sample(50:500, 3),
  COUNT_RACE_BLACK = sample(50:500, 3),
  COUNT_RACE_WHITE = sample(50:500, 3)
)
# Pivot, filter, select data and plot bars.
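ms_1 |>
  # Remove the following line if you want to include Total in the plot
  filter(COUNTY_NAME != "Total") |>
  # Keep the county name and the race columns only
  select(COUNTY_NAME, contains("_RACE_")) |>
  # One row per county/race pair: column names go to "race", values to "count"
  pivot_longer(-COUNTY_NAME, names_to = "race", values_to = "count") |>
  # Strip the common prefix so the legend shows AIAN, ASIAN, BLACK, WHITE
  mutate(race = str_remove(race, "COUNT_RACE_")) |>
  ggplot(aes(COUNTY_NAME, count, group = race, fill = race)) +
  # geom_col() is equivalent to geom_bar(stat = "identity")
  geom_col(position = "dodge")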
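To see what the reshaping step produces on its own, here is a minimal sketch using a small table with made-up fixed counts (the name toy and its numbers are hypothetical, not your real data). It uses the names_prefix argument of pivot_longer() as an alternative to the separate str_remove() step, and then pivot_wider() to show that the reshape can be reversed. The long table has one row per county/race pair, which is why count can serve as the y-axis and race as the fill.

```r
library(tidyr)

# Hypothetical fixed counts (not real vaccination data) so the result is reproducible.
toy <- tibble::tibble(
  COUNTY_NAME      = c("Alleghany", "Montgomery"),
  COUNT_RACE_ASIAN = c(10, 40),
  COUNT_RACE_BLACK = c(20, 50),
  COUNT_RACE_WHITE = c(30, 60)
)

# Wide -> long: the former column names become values of "race",
# with the "COUNT_RACE_" prefix stripped, and the cell values go to "count".
toy_long <- toy |>
  pivot_longer(
    -COUNTY_NAME,
    names_to = "race",
    names_prefix = "COUNT_RACE_",
    values_to = "count"
  )

# Long -> wide: pivot_wider() reverses the reshape, one column per race again.
toy_wide <- toy_long |>
  pivot_wider(names_from = race, values_from = count)

print(toy_long)
print(toy_wide)
```

Here toy_long has 6 rows (2 counties times 3 races) and the columns COUNTY_NAME, race and count, so it can be passed to ggplot() exactly as in the pipeline above.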