r function loops dplyr crosstab

Binding together a table made of multiple crosstabs


I want to create multiple weighted crosstabs for a dataset and then bind them together. I have to do this for around 30 variables, with 13 or so crosstabs each, so I am trying to automate it as much as possible using dplyr, but I'm getting a little lost among all the options for writing functions, loops, etc. I am using the pollster package to create my crosstabs, so I will use the illinois dataset from that package for my example.

Say I want to make a series of tables from this dataset, where each table stacks several crosstabs (say, by sex, educ6, and raceethnic) for a given variable (say I want one table each for maritalstatus, rv, and voter).

I have created a function that can create each crosstab table with the format I want.

create_crosstabs <- function(var1, var2) {

  var1 <- enquo(var1)
  var2 <- enquo(var2)
  
  df2 <- illinois %>% 
  pollster::crosstab(!!var1, !!var2, weight = weight) %>% 
  mutate_if(is.numeric, round) %>% 
  select(-n)
  
return(df2)
}

This works, but I am stuck on the next part. What this code produces if I run

create_crosstabs(educ6, maritalstatus) 

is a table that looks like:

educ6     Married  Widow/divorced  Never Married
LT HS          40              29             31
HS             53              21             26
Some Col       45              17             38

I want to loop over, for example, sex, educ6, and raceethnic, each time creating a crosstab and appending it to the previous one, so that the output table looks like

          Married  Widow/divorced  Never Married
educ
LT HS          40              29             31
HS             53              21             26
Some Col       45              17             38
sex
male           57              12             31
female         50              23             26
race
white          58              18             24
black          31              24             45

I've previously done something like this where I looped over the values of a variable, rather than over multiple variables, but nothing I have tried for this case seems to work.

I will mention here that ideally I would like to further iterate over the second variable such that I get a series of these tables for the crosstabs for maritalstatus, rv, and voter, but if I could at least figure out how to create the individual tables themselves, that would be a big step.


Solution

  • In the function, change enquo() to ensym()

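    create_crosstabs <- function(var1, var2) {

      # ensym() accepts a bare column name or a string and returns a symbol
      var1 <- ensym(var1)
      var2 <- ensym(var2)

      illinois %>%
        pollster::crosstab(!!var1, !!var2, weight = weight) %>%
        mutate(across(where(is.numeric), round)) %>%  # round every numeric column
        select(-n)                                    # drop the n column

    }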
    

    so that it accepts both quoted (string) and unquoted (bare name) input. The function can then be mapped over a named vector of column names:

    library(purrr)
    library(pollster)
    library(dplyr)
    library(gt)

    # one crosstab per row variable, stacked; the vector names fill the group column
    map_dfr(c(educ = "educ6", sex = "sex", race = "raceethnic"),
            ~ create_crosstabs(!!.x, maritalstatus), .id = "group") %>%
      group_by(group) %>%
      gt()
    

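    Alternatively, rename the first column of each crosstab to a common name and add the row groups to the gt table explicitly: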

    v1 <- c(educ = "educ6", sex = "sex", race = "raceethnic")
    # stack the crosstabs, giving the first column of each a common name
    tmp <- imap_dfr(v1,
                    ~ create_crosstabs(!!.x, maritalstatus) %>%
                      rename(group = 1), .id = "grp")
    tmp2 <- tmp %>%
      select(-grp) %>%
      gt(rowname_col = "group")
    # add one labelled row group per variable, iterating over the names in reverse
    for (nm in rev(names(v1)))
      tmp2 <- tmp2 %>%
        tab_row_group(label = nm, rows = which(tmp$grp == nm))
    tmp2
    

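The question also asks about iterating over the second variable (maritalstatus, rv, and voter). A minimal sketch of one way to do that is to wrap the stacking step in a function and map it over the column variables, returning a named list with one stacked table per variable. This assumes rv and voter are columns of illinois, as the question states. The names row_vars, col_vars, stack_crosstabs, and category are made up for illustration and are not part of pollster or of the answer above.

```r
library(dplyr)
library(purrr)
library(pollster)

# Same helper as in the answer, repeated so this block runs on its own
create_crosstabs <- function(var1, var2) {
  var1 <- ensym(var1)
  var2 <- ensym(var2)
  illinois %>%
    pollster::crosstab(!!var1, !!var2, weight = weight) %>%
    mutate(across(where(is.numeric), round)) %>%
    select(-n)
}

# Row variables (names become group labels) and column variables
row_vars <- c(educ = "educ6", sex = "sex", race = "raceethnic")
col_vars <- c("maritalstatus", "rv", "voter")

# Hypothetical helper: stack the crosstabs of every row variable
# against a single column variable given as a string
stack_crosstabs <- function(col_var) {
  map_dfr(row_vars, function(row_var) {
    create_crosstabs(!!row_var, !!col_var) %>%
      rename(category = 1) %>%                    # common name for the first column
      mutate(category = as.character(category))   # so the rows bind cleanly
  }, .id = "group")
}

# Named list: one stacked table per column variable
tables <- map(set_names(col_vars), stack_crosstabs)
tables$maritalstatus
```

A list is used rather than one big data frame because each column variable has its own set of response categories, so the tables do not share columns. Each element could then be formatted as in the answer, for example with tables$rv %>% group_by(group) %>% gt(rowname_col = "category").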