Tags: r, r-markdown, tibble, kable, summarytools

Why does kable in R undo a summarytools frequency option to not display cumulative percentages?


I found an odd quirk, or maybe an intentional feature, and am trying to find a more elegant solution. Any ideas are appreciated. A reproducible code example is below, but I wasn't able to recreate the tables very well here with markdown.

Situation: When I use the freq() function from the summarytools package, I remove the cumulative results using the cumul=FALSE argument for a cleaner output. This works as expected. However, when I pipe the results into a kable() table from the knitr and kableExtra packages, the cumulative results reappear in the kable table. I'm not sure why.

According to summarytools creator Dominic Comtois' page here:

summarytools objects are not always compatible with packages focused on table formatting, such as formattable or kableExtra. However, tb() can be used as a “bridge”, an intermediary step turning freq() and descr() objects into simple tables that any package can work with.

I assume my problem is part of the compatibility issue mentioned there. When I add tb() as a bridge between freq() and kable(), my table contents appear as they should, without the cumulative results. However, the table no longer keeps the original heading names and reverts to the tibble's column names. I also lose the Total row at the bottom.

So far, I managed to rename the tibble column names to be closer to the original freq() output, but haven't yet restored the bottom Totals row. I assume there is a better code approach than what I've done thus far. Thanks in advance!

# set packages and data
library(dplyr)
library(knitr)
library(summarytools)
library(kableExtra)
set.seed(99)
d <- data.frame(group=sample(LETTERS[1:3], size=100, replace=TRUE))

# initial frequency output
d |> freq(group, cumul=FALSE)

# piping the freq() results into kable() brings the cumulative columns back
d |> freq(group, cumul=FALSE) |> 
     kable() |> kable_classic(full_width=FALSE)

# changing to a tibble before kable drops the Totals row and reverts the column headings
d |> freq(group, cumul=FALSE) |> tb() |> 
     kable() |> kable_classic(full_width=FALSE)

# minor fix for the column headings, but still no Totals row
d |> freq(group, cumul=FALSE) |> tb() |> 
     rename("Group"="group", "Freq"="freq", "% Valid"="pct_valid", "% Total"="pct_tot") |> 
     kable() |> kable_classic(full_width=FALSE)

Solution

  • Here's a more straightforward solution:

    set.seed(99)
    d <- data.frame(group=sample(c(LETTERS[1:3], NA), size=500, replace=TRUE))
    
    freq(d$group)[,-c(3,5)] |>
      kable(digits = 1) |>
      kable_classic(full_width = FALSE)
    

    First result table

    To have the "Group" column title:

    as.data.frame(freq(d$group)[,-c(3,5)]) |> 
      tibble::rownames_to_column("Group") |>
      kable(digits = 1) |>
      kable_classic(full_width = FALSE)
    

    Second results table

    The reason kable() ends up displaying the cumulative columns is that freq() returns a matrix that always contains the cumulatives. It is summarytools' print() method that decides which columns and heading elements to show or mask, based on the freq object's attributes. (Check attributes(freq(d$group)) and you'll see what I mean.)
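
    To see this for yourself, here is a minimal sketch that uses the same simulated data as above. It only inspects the object that freq() returns; the exact attribute names depend on the installed summarytools version, so no output is shown.

    ```r
    library(summarytools)

    set.seed(99)
    d <- data.frame(group = sample(c(LETTERS[1:3], NA), size = 500, replace = TRUE))

    # Per the explanation above, cumul = FALSE changes what is printed,
    # not what is stored in the returned object
    f <- freq(d$group, cumul = FALSE)

    # Column names of the underlying matrix (cumulative columns are still there)
    colnames(f)

    # Names of the attributes that summarytools' print method relies on
    names(attributes(f))
    ```

    Because kable() works from the stored matrix rather than from the printed output, it has no way of knowing that cumul = FALSE was requested.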

    An extra tip for displaying frequency tables with a "valid" column: use the following knitr/kable options:

    options(knitr.kable.NA = '')
    

    This way, we have a blank cell in lieu of "NA":

    Third results table
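
    As a variation on the positional indexing used above, the cumulative columns can also be dropped by name. This is only a sketch. It assumes that the cumulative columns are the ones whose names contain "Cum", and it strips the summarytools class with unclass() so that the object is treated as an ordinary matrix.

    ```r
    library(summarytools)
    library(knitr)
    library(kableExtra)

    # Show blank cells instead of "NA"
    options(knitr.kable.NA = "")

    set.seed(99)
    d <- data.frame(group = sample(c(LETTERS[1:3], NA), size = 500, replace = TRUE))

    # Plain matrix with the same columns that freq() stores
    m <- unclass(freq(d$group))

    # Assumption: cumulative columns are identified by "Cum" in their name
    keep <- !grepl("Cum", colnames(m), fixed = TRUE)

    m[, keep, drop = FALSE] |>
      kable(digits = 1) |>
      kable_classic(full_width = FALSE)
    ```

    Selecting by name avoids hard-coding the positions 3 and 5, which may matter if the column layout changes (for example, when report.nas = FALSE is used).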