r ggplot2 bar-chart fill

Dodged Bar Chart


I am trying to make a dodged bar chart with ggplot2 from a data set of county-level vaccination data. I want the x-axis to show the two counties, and I want the fill to be based on specific columns (one column per race). I am not sure what to use for the y-axis. I think I need to rotate the data set and add a race column to use for the fill, but I am not sure how to do that.

I have tried to rotate the data, but I am not able to make the first row the column names.


Solution

  • The operation you are looking for is called “reshaping” or “pivoting”. The tidyr package provides pivot_longer() and pivot_wider() for that purpose. Below is an example of how to apply pivot_longer() to your data and create a bar plot. After pivoting, the counts sit in a single column, which is what goes on the y-axis.

    library(tidyverse)
    
    # Creating parts of your dataset with random numbers.
    # (As some of the commenters pointed out, it would be better
    # if you could include your data in your question 
    # in a reproducible fashion.)
    set.seed(1)  # fix the seed so the random example data is reproducible
    ms_1 <- tibble(
      COUNTY_NAME = c("Alleghany", "Montgomery", "Total"),
      COUNT_TOTAL = sample(50:500, 3),
      COUNT_ETH_NHL = sample(50:500, 3),
      COUNT_ETH_HL = sample(50:500, 3),
      COUNT_ETH_UNKNOWN = sample(50:500, 3),
      COUNT_RACE_AIAN = sample(50:500, 3),
      COUNT_RACE_ASIAN = sample(50:500, 3),
      COUNT_RACE_BLACK = sample(50:500, 3),
      COUNT_RACE_WHITE = sample(50:500, 3),
    )
    
    # Filter and select the data, pivot it to long format, and plot bars.
    ms_1 |> 
      # Remove the following line if you want to include Total in the plot
      filter(COUNTY_NAME != "Total") |>
      select(COUNTY_NAME, contains("_RACE_")) |> 
      pivot_longer(-COUNTY_NAME, names_to = "race", values_to = "count") |> 
      mutate(race = str_remove(race, "COUNT_RACE_")) |> 
      ggplot(aes(COUNTY_NAME, count, fill = race)) +
      geom_col(position = "dodge")
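
To make the reshaping step easier to see, here is a minimal sketch with a small, fixed table. The numbers are invented for illustration and are not from the asker's data, and the names wide, long and p are placeholders. It uses the names_prefix argument of pivot_longer() instead of a separate str_remove() step, which should give the same race labels.

```r
library(tidyr)
library(ggplot2)

# Hypothetical counts, made up purely to show the reshaping step.
wide <- data.frame(
  COUNTY_NAME      = c("Alleghany", "Montgomery"),
  COUNT_RACE_ASIAN = c(10, 20),
  COUNT_RACE_BLACK = c(30, 40),
  COUNT_RACE_WHITE = c(50, 60)
)

# One row per county/race pair; names_prefix strips the shared column prefix.
long <- pivot_longer(
  wide,
  cols = starts_with("COUNT_RACE_"),
  names_to = "race",
  names_prefix = "COUNT_RACE_",
  values_to = "count"
)

# 2 counties x 3 race columns = 6 rows, with columns COUNTY_NAME, race, count.
print(long)

# The pivoted count column is the y aesthetic; race drives the fill.
p <- ggplot(long, aes(x = COUNTY_NAME, y = count, fill = race)) +
  geom_col(position = "dodge")
```

Printing p draws the dodged bars. Each county ends up with one bar per race because every county/race combination is its own row in long.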