r ggplot2

How to build a barplot using ggplot2 with two sets of bars: one set for all observations per group, and a second set for the number of observations meeting a criterion?


I have a table that looks like this:

tibble(person = c("Paul", "Paul", "Paul", "Sarah", "Sarah", "Sarah", "Alex", "Alex", "Alex"),
       salary = c(8, 10, 3, 6, 6, 6, 1, 1, 3))

How would I build a barplot with two sets of bars side by side that shows:

  1. the total number of salary observations per person, which would give bars c(3, 3, 3)
  2. the number of observations with a salary higher than 5, which would give bars c(2, 3, 0)

Solution

  • I wouldn't advise trying to do this in ggplot directly. Rather, you should precompute your summaries and plot those:

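    library(tidyverse)

    df <- tibble(person = c("Paul", "Paul", "Paul", "Sarah", "Sarah", "Sarah", "Alex", "Alex", "Alex"),
                 salary = c(8, 10, 3, 6, 6, 6, 1, 1, 3))

    # One row per person with both counts, then reshape to long format
    # so that each bar is a row. The .by argument needs dplyr >= 1.1.0.
    salary_counts <- df |>
      summarize(
        total_salary = n(),             # all observations for this person
        high_salary = sum(salary > 5),  # observations with salary above 5
        .by = person
      ) |>
      pivot_longer(cols = c(total_salary, high_salary), names_to = 'count_type', values_to = 'count')

    # Map the count type to fill and dodge the bars so they sit side by side.
    # The `_` placeholder for the native pipe needs R >= 4.2.
    salary_plot <- salary_counts |>
      ggplot(data = _, aes(x = person, y = count, fill = count_type)) +
      geom_col(position = 'dodge')
    print(salary_plot)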
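
    One thing to be aware of: ggplot2 orders a character x axis alphabetically, so the bars above come out as Alex, Paul, Sarah rather than in the Paul, Sarah, Alex order used in the question. If that order matters, a possible follow-up (a sketch, not the only way to do it) is to convert both grouping columns to factors before plotting. The names `ordered_counts` and `ordered_plot` and the legend labels are my own choices for illustration:

    ```r
    library(tidyverse)

    df <- tibble(
      person = c("Paul", "Paul", "Paul", "Sarah", "Sarah", "Sarah", "Alex", "Alex", "Alex"),
      salary = c(8, 10, 3, 6, 6, 6, 1, 1, 3)
    )

    ordered_counts <- df |>
      summarize(
        total_salary = n(),
        high_salary = sum(salary > 5),
        .by = person
      ) |>
      pivot_longer(
        cols = c(total_salary, high_salary),
        names_to = "count_type",
        values_to = "count"
      ) |>
      mutate(
        # Keep people in order of first appearance in the data
        person = fct_inorder(person),
        # Draw the "all observations" bar first within each person
        count_type = factor(count_type, levels = c("total_salary", "high_salary"))
      )

    ordered_plot <- ggplot(ordered_counts, aes(x = person, y = count, fill = count_type)) +
      geom_col(position = "dodge") +
      # Illustrative legend labels, matched to the factor levels by name
      scale_fill_discrete(
        labels = c(total_salary = "All observations", high_salary = "Salary > 5")
      ) +
      labs(x = NULL, y = "Number of observations", fill = NULL)
    print(ordered_plot)
    ```

    `fct_inorder()` comes from forcats, which `library(tidyverse)` attaches. Alex's zero count still reserves its slot in the dodge because the precomputed summary keeps a row for it.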
    
