r dplyr janitor

Force tabyl to show even columns with all 0 as values


The tabyl function from the janitor package generates a frequency table of the specified variables. A value that never occurs in the data does not get a row or column of its own in the output.

Let's say I have the following dataframe:

df <- data.frame( var_1 = c(rep("ja",7),rep("nein",13)),
                  var_2 = c(31:50),
                  var_3 = c(rep(0,10),rep(1,10)))

Then the output of tabyl will be like:

library(janitor)
df %>% 
  tabyl(var_1,var_3)

#  var_1 0  1
#     ja 7  0
#   nein 3 10

Which is awesome. However, let's say I have the following dataframe:

df2 <- data.frame( var_1 = c(rep("ja",7),rep("nein",13)),
                  var_2 = c(31:50),
                  var_3 = c(rep(0,10),rep(0,10)))

Then the output will be like:

df2 %>% 
  tabyl(var_1,var_3)

#  var_1  0
#     ja  7
#   nein 13

But I want it to be:

 var_1 0  1
    ja 7  0
  nein 13 0

However, if no observation in the dataset has a given value of var_3 (here, 1), the corresponding column is not shown in the output of tabyl, even with show_missing_levels = TRUE and show_na = TRUE:

df2 %>% 
  tabyl(var_1,var_3, show_missing_levels = TRUE, show_na = TRUE) 
#var_1  0
#ja     7
#nein   13

Any idea how to achieve that, preferably within a pipe and tabyl?


Solution

  • Convert your var_3 to a factor with the appropriate levels:

    library(dplyr)    # mutate()
    library(janitor)  # tabyl()

    df2 %>% 
      mutate(var_3 = factor(var_3, levels = 0:1)) %>% 
      tabyl(var_1, var_3)
    
     var_1  0 1
        ja  7 0
      nein 13 0
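
As far as I understand it, this works because show_missing_levels only has something to act on when the column is a factor: a plain numeric column carries no record of the values it could have taken. Here is a minimal sketch of that reasoning. It uses base R's factor() assignment instead of mutate() (my substitution, to keep the snippet free of dplyr), and it rebuilds df2 from the question so it runs on its own:

```r
library(janitor)  # tabyl()

# Same data as df2 in the question: var_3 is 0 for every row
df2 <- data.frame(var_1 = c(rep("ja", 7), rep("nein", 13)),
                  var_2 = 31:50,
                  var_3 = rep(0, 20))

# A numeric column has no levels, so there is no "missing level" to show
levels(df2$var_3)   # NULL

# Declare both possible values explicitly, even though 1 never occurs
df2$var_3 <- factor(df2$var_3, levels = 0:1)
levels(df2$var_3)   # "0" "1"

# The unused level now gets its own column, filled with zeros
res <- tabyl(df2, var_1, var_3)
res
```

If var_3 can take more values than 0 and 1, list all of them in levels; any value left out of levels is turned into NA by factor().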