Cumulative Crosstable with tabyl()


I picked up an old project from a few months ago to make adjustments. In this project I use the janitor package to create tables with cumulative percentages. This has definitely worked before. Now the same code returns an error.


    library(tidyverse)
    library(janitor)
    library(scales)
    
    tb <- tibble(
        survey = c(1, 3, 4, 4, 2, 1, 4, 1, 2, 2),
        diagnosis = factor(c("yes", "yes", "yes", "no", "yes", "yes", "no", "no", "yes", "yes"))
    )
      
    tb %>% 
        tabyl(survey, diagnosis) %>% 
        mutate(cum_no = percent(cumsum(no)/sum(no), accuracy = 0.1)) %>% 
        mutate(cum_yes = percent(cumsum(yes)/sum(yes), accuracy = 0.1)) %>% 
        adorn_totals(where = "both", ... = c(yes, no)) %>% 
        adorn_percentages(denominator = "col", na.rm = TRUE, ... = c(yes, no, Total)) %>%
        adorn_pct_formatting() %>%
        adorn_ns(position = "front", ... = c(yes, no, Total))
    #> Error in adorn_ns(., position = "front", ... = c(yes, no, Total)): argument "ns" cannot be null; if not calling adorn_ns() on a data.frame of class "tabyl", pass your own value for ns

Created on 2022-06-03 by the reprex package (v2.0.1)

From what I can tell, the problem lies in the "cum_no" and "cum_yes" variables, where I calculate the cumulative percentages. When I remove those lines, everything works as it should. My guess is that adorn_ns() takes the Ns from the core attribute of the tabyl class:

    > str(tabyl(tb, survey, diagnosis))
    Classes ‘tabyl’ and 'data.frame':   4 obs. of  3 variables:
     $ survey: num  1 2 3 4
     $ no    : num  1 0 0 2
     $ yes   : num  2 3 1 1
     - attr(*, "core")='data.frame':    4 obs. of  3 variables:
      ..$ survey: num [1:4] 1 2 3 4
      ..$ no    : num [1:4] 1 0 0 2
      ..$ yes   : num [1:4] 2 3 1 1
     - attr(*, "tabyl_type")= chr "two_way"
     - attr(*, "var_names")=List of 2
      ..$ row: chr "survey"
      ..$ col: chr "diagnosis"

After I mutate the cum variables the core attribute is gone:


    Classes ‘tabyl’ and 'data.frame':   4 obs. of  5 variables:
     $ survey : num  1 2 3 4
     $ no     : num  1 0 0 2
     $ yes    : num  2 3 1 1
     $ cum_no : chr  "33.3%" "33.3%" "33.3%" "100.0%"
     $ cum_yes: chr  "28.6%" "71.4%" "85.7%" "100.0%"
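For reference, the cum_no and cum_yes values shown above can be reproduced without any packages. The following is a minimal base R sketch of the same arithmetic (running count divided by the column total); it uses sprintf() in place of scales::percent(), which is a substitution made here for illustration, and the vectors are typed in by hand from the tabyl output.

```r
# Counts per survey value, copied from the tabyl output above
no  <- c(1, 0, 0, 2)
yes <- c(2, 3, 1, 1)

# Cumulative share of each column, formatted to one decimal place
# (sprintf() stands in for scales::percent(accuracy = 0.1) here)
cum_no  <- sprintf("%.1f%%", 100 * cumsum(no)  / sum(no))
cum_yes <- sprintf("%.1f%%", 100 * cumsum(yes) / sum(yes))

print(cum_no)
print(cum_yes)
```
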

What I don't understand is how to deal with this problem, and why it used to work but no longer does (maybe the package has been updated?).

Thank you for your help and patience.


Solution

  • The problem appears to be that the extra attributes tabyl() attaches (such as core) are lost in the subsequent operations. We can restore them so that adorn_ns() works. The code below saves the result of the tabyl() step as a first object, applies the transformations to create a second object, and then copies the missing attributes from the first object to the second before calling adorn_ns().

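    library(dplyr)
    library(janitor)
    library(scales)  # provides percent()

    # Keep the plain tabyl so its attributes (core, tabyl_type, var_names) stay available
    tb1 <- tb %>% 
        tabyl(survey, diagnosis) 

    # Apply the transformations; per the question, the tabyl attributes are lost along the way
    tb2 <- tb1 %>%
        mutate(cum_no = percent(cumsum(no)/sum(no), accuracy = 0.1)) %>% 
        mutate(cum_yes = percent(cumsum(yes)/sum(yes), accuracy = 0.1)) %>% 
        adorn_totals(where = "both", ... = c(yes, no)) %>% 
        adorn_percentages(denominator = "col", na.rm = TRUE, ... = c(yes, no, Total)) %>%
        adorn_pct_formatting()

    # Copy over the attributes that tb1 has but tb2 is missing
    nm1 <- setdiff(names(attributes(tb1)), names(attributes(tb2)))
    attributes(tb2)[nm1] <- attributes(tb1)[nm1]

    # The empty third argument leaves ns at its default, attr(tb2, "core")
    tb2 %>% 
        adorn_ns(position = "front", , c(yes, no, Total))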
     survey         no        yes cum_no cum_yes      Total
          1 1  (33.3%) 2  (28.6%)  33.3%   28.6% 1  (30.0%)
          2 0   (0.0%) 3  (42.9%)  33.3%   71.4% 0  (30.0%)
          3 0   (0.0%) 1  (14.3%)  33.3%   85.7% 0  (10.0%)
          4 2  (66.7%) 1  (14.3%) 100.0%  100.0% 2  (30.0%)
      Total 3 (100.0%) 7 (100.0%)      -       - 3 (100.0%)
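
If this pattern comes up more than once, the attribute-copying step can be wrapped in a small helper. The sketch below uses base R only; restore_attrs() is a hypothetical name introduced here, not a janitor function, and the two data frames are hand-built stand-ins for tb1 and tb2 (the "after" object is constructed from scratch to imitate a step that drops custom attributes).

```r
# Hypothetical helper: copy every attribute that `source` has and `target` lacks
restore_attrs <- function(target, source) {
  missing_attrs <- setdiff(names(attributes(source)), names(attributes(target)))
  for (nm in missing_attrs) {
    attr(target, nm) <- attr(source, nm, exact = TRUE)
  }
  target
}

# Stand-in for tb1: a data frame carrying tabyl-like custom attributes
counts <- data.frame(survey = 1:4, no = c(1, 0, 0, 2), yes = c(2, 3, 1, 1))
before <- counts
attr(before, "core") <- counts
attr(before, "tabyl_type") <- "two_way"

# Stand-in for tb2: rebuilt from scratch, so the custom attributes are absent
after <- data.frame(
  survey = before$survey,
  no     = before$no,
  yes    = before$yes,
  cum_no = cumsum(before$no) / sum(before$no)
)

fixed <- restore_attrs(after, before)
print(names(attributes(fixed)))
```

Only attributes that are absent from the target are copied, so names, row.names and class of the transformed object are left as they are, which mirrors the setdiff() approach used above.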