r modelsummary

modelsummary tabulate single factor variable


How can I tabulate a single factor variable with modelsummary?

I tried this:

> modelsummary::datasummary_crosstab(cyl ~ 1, data = mtcars)
Error in datasummary(formula = stats::as.formula(d_formula), data = data,  : 
  subscript out of bounds

The expected result is this (with percentages adjusted to sum to 1 over the column):

[image of the expected table]


Solution

  • All columns in mtcars are numeric, so cyl has to be converted to a factor first. Once it is a factor, you can call datasummary_skim() with type = "categorical", which summarises only the categorical variables (effectively excluding the "num" columns), e.g.

    library(tidyverse)
    
    str(mtcars)
    #> 'data.frame':    32 obs. of  11 variables:
    #>  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
    #>  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
    #>  $ disp: num  160 160 108 258 360 ...
    #>  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
    #>  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
    #>  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
    #>  $ qsec: num  16.5 17 18.6 19.4 17 ...
    #>  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
    #>  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
    #>  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
    #>  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
    
    # make cyl a factor
    df <- mtcars
    df$cyl <- factor(df$cyl)
    modelsummary::datasummary_skim(type = "categorical", data = df)
    
    cyl    N     %
    4     11  34.4
    6      7  21.9
    8     14  43.8

    Created on 2025-03-11 with reprex v2.1.1

    Not sure if this will work for your use case, but it might be worth exploring further.
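
If you want to sanity-check the counts and percentages independently of modelsummary, here is a minimal base R sketch. It is not part of the modelsummary API, only a cross-check; the names `counts`, `shares` and `tab` are made up for illustration.

```r
# Base R cross-check of the cyl frequency table; no packages needed.
df <- mtcars
df$cyl <- factor(df$cyl)

counts <- table(df$cyl)        # number of rows per cyl level
shares <- prop.table(counts)   # proportions; these sum to 1 over the column

# 'tab' is a hypothetical name for the combined result
tab <- data.frame(
  cyl = names(counts),
  N   = as.vector(counts),
  pct = round(100 * as.vector(shares), 1)
)
print(tab)
```

The `N` column should match the counts in the datasummary_skim() table, and `shares` holds the unrounded proportions if you need them to sum to exactly 1 rather than being shown as percentages.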