rdplyracross

Create summary row by group for the count of a particular observation


I have a dataset that looks like this

df1 <- tibble(ha_name = c("Interior", "Interior", "Interior", "Interior", "Interior", "Interior",
                          "Fraser", "Fraser", "Fraser", "Fraser", "Fraser", "Fraser",
                          "Northern", "Northern", "Northern", "Northern", "Northern", "Northern"),
              facility = c("Facility 1", "Facility 2", "Facility 3", "Facility 4", "Facility 5", "Facility 6",
                           "Facility 1", "Facility 2", "Facility 3", "Facility 4", "Facility 5", "Facility 6",
                           "Facility 1", "Facility 2", "Facility 3", "Facility 4", "Facility 5", "Facility 6"),
              pre_mandate = c("Planning", "Implementation", "Planning", "Planning", "Implementation", "Planning",
                              "Implementation", "Planning", "Planning", "Planning", "Implementation", "Implementation",
                              "Planning", "Planning", "Planning", "Implementation", "Planning", "Planning"),
              current = c("Planning", "Implementation", "Implementation", "Planning", "Implementation", "Planning",
                          "Implementation", "Planning", "Planning", "Implementation", "Implementation", "Implementation",
                          "Planning", "Implementation", "Planning", "Implementation", "Planning", "Implementation")
)

My goal is to add a summary row for each ha_name which counts the number of times "Implementation" was in the pre_mandate and the current column.

e.g. for Interior, the row would be Interior - All Interior - 2 - 3

I was trying to apply code I had found elsewhere for doing basically this but with numerical values in a single column

df2 <- df1 %>% group_by(ha_name, fiscal_year, quarter) %>% 
  group_modify(~ bind_rows(., summarise(., count = sum(count)))) %>% 
  ungroup() %>% 
  mutate(facility = coalesce(facility, paste("All", ha_name)),
         ha_name = if_else(startsWith(facility, "All"), "", ha_name))

but when I try to modify it I'm a bit stuck as to how to get it to count just the instances of "Implementation".

What I'm currently poking around with is this

f2 <- df1 %>% group_by(ha_name) %>% 
  group_modify(~ bind_rows(., summarise(., across(pre_mandate:current, `%in%`, "Implementation")))) %>% 
  ungroup() %>% 
  mutate(facility = coalesce(facility, paste("All", ha_name)),
         ha_name = if_else(startsWith(facility, "All"), "", ha_name))

but the usage of across with %in% doesn't return a single value. Not sure if this is close and I just need to add one more function to the across to sum the results or if I'm way off the mark.

Any help would be appreciated!


Solution

  • The issue is that using %in% will return a boolean vector. But as you want the count of true values you have to wrap in sum (and convert the result to a character):

    library(dplyr, warn.conflicts = FALSE)
    
    df1 %>%
      group_by(ha_name) %>%
      group_modify(
        ~ bind_rows(., summarise(., across(
          c(pre_mandate, current), ~ as.character(
            sum(.x %in% "Implementation")
          )
        )))
      ) %>%
      ungroup() %>%
      mutate(
        facility = coalesce(facility, paste("All", ha_name))
      )
    #> # A tibble: 21 × 4
    #>    ha_name  facility   pre_mandate    current       
    #>    <chr>    <chr>      <chr>          <chr>         
    #>  1 Fraser   Facility 1 Implementation Implementation
    #>  2 Fraser   Facility 2 Planning       Planning      
    #>  3 Fraser   Facility 3 Planning       Planning      
    #>  4 Fraser   Facility 4 Planning       Implementation
    #>  5 Fraser   Facility 5 Implementation Implementation
    #>  6 Fraser   Facility 6 Implementation Implementation
    #>  7 Fraser   All Fraser 3              4             
    #>  8 Interior Facility 1 Planning       Planning      
    #>  9 Interior Facility 2 Implementation Implementation
    #> 10 Interior Facility 3 Planning       Implementation
    #> # ℹ 11 more rows