rgtr-flextable

Presentation of calculated data within a table omitting the value of the respective column


I have data like this: People are attributed to specific Locations, and it has been recorded whether an event has been successful or the data is missing.

df <- data.frame(PersonID = c(1:20),
                 Location = c("B","A","D","C","A","D","C","D","A","D","B","A","D","C","A","D","C","D","A","D"),
                 Success = c("yes","no","yes",NA,"yes","no","no","yes",NA,"yes","no","yes",NA,"yes","no","no","yes",NA,"yes","no"))

I would like to know how each location "performs" relative to the other locations, i. e. how many valid attempts have been successful and how the location's rate of success compares to the other locations.

So in my example, location "A" has seen 5 valid attempts (1 "NA"), of which 3 were successful (60%). The other locations have a success rate of 50%, 66.7%, and 50%, an average of 55.6%. Location A is thus 4.4 percentage points higher than the other locations' average. I want to display all that information in a table just like this:

table

I don't have a specific preference regarding packages, but I like and know some gt and flextable.


Solution

  • This was not trivial in any way:

    library(dplyr)
    library(tidyr)
    library(purrr)
    library(tibble)
    library(janitor)
    library(gt)
    
    df1 <- df %>% 
      group_by(Location) %>% 
      mutate(attempts = sum(!is.na(Success)),
             yeses = sum(Success == "yes", na.rm = TRUE),
             Success_rate = (yeses/attempts)*100) 
      
    df2 <- df1 %>% 
      summarise(avgother = mean(Success_rate)) %>% 
      mutate(avgother = map_dbl(row_number(), ~mean(avgother[-.x])))
      )
    
    df %>% 
      group_by(Location) %>% 
      summarise(attempts = sum(!is.na(Success)),
             yeses = sum(Success == "yes", na.rm = TRUE),
             Success_rate = (yeses/attempts)*100) %>% 
      bind_cols(avgother= round(df2$avgother, 1)) %>% 
      mutate(comp.avg = Success_rate - avgother) %>% 
      mutate(`attempts` = paste0("(N=", attempts, ")"),
             Success_rate = paste0(round(Success_rate, 1), "%")) %>% 
      select(-yeses) %>% 
      mutate(comp.avg = ifelse(comp.avg >0, paste0("(+",round(comp.avg, 1),")"), paste0("(",round(comp.avg,1),")"))) %>% 
      t() %>% 
      as.data.frame() %>% 
      rownames_to_column("Location") %>% 
      row_to_names(row_number = 1) %>% 
      gt() 
    

    enter image description here