I have a dataset that looks like this
df1 <- tibble(ha_name = c("Interior", "Interior", "Interior", "Interior", "Interior", "Interior",
"Fraser", "Fraser", "Fraser", "Fraser", "Fraser", "Fraser",
"Northern", "Northern", "Northern", "Northern", "Northern", "Northern"),
facility = c("Facility 1", "Facility 2", "Facility 3", "Facility 4", "Facility 5", "Facility 6",
"Facility 1", "Facility 2", "Facility 3", "Facility 4", "Facility 5", "Facility 6",
"Facility 1", "Facility 2", "Facility 3", "Facility 4", "Facility 5", "Facility 6"),
pre_mandate = c("Planning", "Implementation", "Planning", "Planning", "Implementation", "Planning",
"Implementation", "Planning", "Planning", "Planning", "Implementation", "Implementation",
"Planning", "Planning", "Planning", "Implementation", "Planning", "Planning"),
current = c("Planning", "Implementation", "Implementation", "Planning", "Implementation", "Planning",
"Implementation", "Planning", "Planning", "Implementation", "Implementation", "Implementation",
"Planning", "Implementation", "Planning", "Implementation", "Planning", "Implementation")
)
My goal is to add a summary row for each ha_name which counts the number of times "Implementation" was in the pre_mandate and the current column.
e.g. for Interior, the row would be
Interior - All Interior - 2 - 3
I was trying to apply code I had found elsewhere for doing basically this but with numerical values in a single column
df2 <- df1 %>% group_by(ha_name, fiscal_year, quarter) %>%
group_modify(~ bind_rows(., summarise(., count = sum(count)))) %>%
ungroup() %>%
mutate(facility = coalesce(facility, paste("All", ha_name)),
ha_name = if_else(startsWith(facility, "All"), "", ha_name))
but when I try to modify it I'm a bit stuck as to how to get it to count just the instances of "Implementation".
What I'm currently poking around with is this
f2 <- df1 %>% group_by(ha_name) %>%
group_modify(~ bind_rows(., summarise(., across(pre_mandate:current, `%in%`, "Implementation")))) %>%
ungroup() %>%
mutate(facility = coalesce(facility, paste("All", ha_name)),
ha_name = if_else(startsWith(facility, "All"), "", ha_name))
but the usage of across with %in%
doesn't return a single value. Not sure if this is close and I just need to add one more function to the across to sum the results or if I'm way off the mark.
Any help would be appreciated!
The issue is that using %in%
will return a boolean vector. But as you want the count of true values you have to wrap in sum
(and convert the result to a character):
library(dplyr, warn.conflicts = FALSE)
df1 %>%
group_by(ha_name) %>%
group_modify(
~ bind_rows(., summarise(., across(
c(pre_mandate, current), ~ as.character(
sum(.x %in% "Implementation")
)
)))
) %>%
ungroup() %>%
mutate(
facility = coalesce(facility, paste("All", ha_name))
)
#> # A tibble: 21 × 4
#> ha_name facility pre_mandate current
#> <chr> <chr> <chr> <chr>
#> 1 Fraser Facility 1 Implementation Implementation
#> 2 Fraser Facility 2 Planning Planning
#> 3 Fraser Facility 3 Planning Planning
#> 4 Fraser Facility 4 Planning Implementation
#> 5 Fraser Facility 5 Implementation Implementation
#> 6 Fraser Facility 6 Implementation Implementation
#> 7 Fraser All Fraser 3 4
#> 8 Interior Facility 1 Planning Planning
#> 9 Interior Facility 2 Implementation Implementation
#> 10 Interior Facility 3 Planning Implementation
#> # ℹ 11 more rows