r crosstab categorical-data chi-squared multivariate-testing

How can I cross all variables against each other and gather Chi Square test values in R?


I would like to cross all 40 categorical variables in my data file against each other (40 × 40 = 1,600 crosstabs, of which 780 are unique pairs) and gather the p-values of all the Chi-Square tests, preferably in one list, in order to see which variables are most closely related.

Is there a simple way to do this in R?


Solution

  • You can use the combn function to enumerate all pairs of variables and run a test on each pair.

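    As a simpler starting point, if you have a data.table named dt and want to test every variable against a single variable named result, you can use the following code. Note that it tests each column against result only, not every pair of columns.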

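    library(data.table)
    library(magrittr)  # %>%
    library(dplyr)     # mutate, select, arrange
    library(purrr)     # map
    library(tidyr)     # unnest
    library(tibble)    # tibble
    
    # Test every column of dt against dt$result, then rank by p-value.
    # The broom package must be installed for broom::tidy().
    chi_dt <- dt %>%
      map(~chisq.test(.x, dt$result)) %>%
      tibble(names = names(.), data = .) %>%
      mutate(stats = map(data, broom::tidy)) %>%
      unnest(stats) %>%
      select(-data) %>%
      arrange(p.value, desc(statistic))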
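To cross every variable against every other one, as the question asks, the same idea can be extended with combn. The following is only a minimal base R sketch, not a definitive implementation: the data frame df and its columns a, b and c are made-up example data, and the result name chi_all is a placeholder. Replace df with your own data frame of categorical variables.

```r
# Hypothetical example data; substitute your own data frame of factors.
set.seed(1)
df <- data.frame(
  a = sample(c("x", "y"), 200, replace = TRUE),
  b = sample(c("u", "v", "w"), 200, replace = TRUE),
  c = sample(c("p", "q"), 200, replace = TRUE),
  stringsAsFactors = TRUE
)

# All unordered pairs of column names, as a list of length-2 character vectors.
pairs <- combn(names(df), 2, simplify = FALSE)

# Run one chi-squared test per pair and keep the statistic and p-value.
results <- lapply(pairs, function(p) {
  test <- chisq.test(df[[p[1]]], df[[p[2]]])
  data.frame(
    var1 = p[1],
    var2 = p[2],
    statistic = unname(test$statistic),
    p.value = test$p.value
  )
})

# Combine into one data frame, smallest p-values first.
chi_all <- do.call(rbind, results)
chi_all <- chi_all[order(chi_all$p.value), ]
print(chi_all)
```

With 40 variables, combn yields choose(40, 2) = 780 pairs, so chi_all would have 780 rows. chisq.test warns when expected cell counts are small, so sparse crosstabs may need a closer look.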