
How to rebuild a descriptive statistics table with the gtsummary package


I've built a descriptive statistics table for the dataset I'm working on, using the code below.

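library(dplyr)      # group_by(), summarise(), mutate(), %>%
library(tidyr)      # nest(), unnest()
library(purrr)      # map()
library(flextable)  # flextable(), merge_v(), colformat_double()

# Per-group descriptive statistics for one numeric column
sum_stats <- function(data, group, value, alpha = 0.05) data %>%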
  group_by(!!enquo(group)) %>%
  summarise(
    n = n(),
    q1 = quantile(!!enquo(value), 1/4, type = 8),  # 8 passed by name; positionally it is read as na.rm
    min = min(!!enquo(value)),
    mean = mean(!!enquo(value)),
    median = median(!!enquo(value)),
    q3 = quantile(!!enquo(value), 3/4, type = 8),  # 8 passed by name; positionally it is read as na.rm
    max = max(!!enquo(value)),
    sd = sd(!!enquo(value)),
    stderr = sd/sqrt(n),
    kurtosis = e1071::kurtosis(!!enquo(value)),
    skewness = e1071::skewness(!!enquo(value)),
    # confidence limits for the mean, using the alpha argument instead of a hard-coded 0.05
    LCL = mean - qt(1 - (alpha / 2), n - 1) * stderr,
    UCL = mean + qt(1 - (alpha / 2), n - 1) * stderr,
    #SW.stat = ShapiroTest(!!enquo(value), alpha)$statistic,
    #SW.p = ShapiroTest(!!enquo(value), alpha)$p.value,
    #SW.test = ShapiroTest(!!enquo(value), alpha)$test,
    nout = length(boxplot.stats(!!enquo(value))$out)
  )

nested_out <- out %>% 
  mutate(COND = factor(COND)) %>%
  group_by(signals) %>% 
  nest() 

stats_nested <- nested_out %>% group_by(signals) %>%
  mutate(stats = map(data, ~sum_stats(.x, COND, value))) %>% 
  unnest(stats) %>% 
  dplyr::select(-'data') %>% 
  flextable() %>% 
  merge_v(j = 'signals') %>% 
  colformat_double(digits = 2)

What should I do to obtain roughly the same output with the gtsummary package?

Here is a short extract of the dataset:

structure(list(ID = c("01", "01", "01", "01", "01", "01"), GR = c("RP", 
"RP", "RP", "RP", "RP", "RP"), SES = c("V", "V", "V", "V", "V", 
"V"), COND = c("NEG-CTR", "NEG-CTR", "NEG-CTR", "NEG-CTR", "NEG-CTR", 
"NEG-CTR"), signals = c("P3(400-450).FCz", "P3(400-450).Cz", 
"P3(400-450).Pz", "LPPearly(500-700).FCz", "LPPearly(500-700).Cz", 
"LPPearly(500-700).Pz"), value = c(-11.6312151716924, -5.16524399006139, 
11.8802266972569, -11.7785042972793, -5.96429031525769, 8.23981597718437
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))

Thanks in advance to anyone who answers.


Solution

  • The gtsummary package typically places multiple statistics in a single column. If you'd like each statistic in a separate column, you can call tbl_summary() once per summary statistic. This gives you a list of tbl_summary() objects, which you can merge into a single table with tbl_merge(). Example below!

    library(gtsummary)
    #> #Uighur
    packageVersion("gtsummary")
    #> [1] '1.5.0'
    
    # iterate over summary statistics
    c("mean", "sd", "min", "max") %>%
      lapply(
        function(.x) {
          tbl_summary(
            data = trial, 
            statistic = everything() ~ paste0("{", .x, "}"),
            include = c(age, marker),
            missing = "no"
          ) %>%
            modify_header(all_stat_cols() ~ glue::glue("**{.x}**"))
        }
      ) %>%
      # merge all tables to get one column per summary statistic
      tbl_merge() %>%
      # remove the default headers and footnotes
      modify_spanning_header(everything() ~ NA) %>%
      modify_footnote(everything() ~ NA) %>%
      as_kable() # convert to kable to display on SO
    
    | Characteristic        | mean | sd   | min  | max  |
    |:----------------------|:-----|:-----|:-----|:-----|
    | Age                   | 47   | 14   | 6    | 83   |
    | Marker Level (ng/mL)  | 0.92 | 0.86 | 0.00 | 3.87 |

    Created on 2021-10-28 by the reprex package (v2.0.1)
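
The same iterate-and-merge pattern might be adapted to data shaped like the extract in the question (columns ID, COND, signals, value). The sketch below is only an illustration under several assumptions: the data frame `out` is simulated here with made-up values, the second condition label `"NEG-NOC"` is hypothetical, and the long data is first reshaped with tidyr::pivot_wider() so that each signal becomes one row of the summary table. Statistics that tbl_summary() does not provide out of the box (kurtosis, skewness, confidence limits, outlier counts) are left out.

```r
library(gtsummary)
library(dplyr)
library(tidyr)

# Simulated stand-in for the question's data (values are made up;
# the "NEG-NOC" condition label is hypothetical)
set.seed(1)
out <- expand.grid(
  ID = sprintf("%02d", 1:10),
  COND = c("NEG-CTR", "NEG-NOC"),
  signals = c("P3(400-450).FCz", "P3(400-450).Cz"),
  stringsAsFactors = FALSE
) %>%
  mutate(value = rnorm(n()))

signal_cols <- unique(out$signals)

# One column per signal, so each signal becomes a row in the summary table
wide <- out %>%
  pivot_wider(names_from = signals, values_from = value)

# One tbl_summary() per statistic, as in the example above
stat_names <- c("mean", "sd", "median", "min", "max")
stat_tables <- lapply(stat_names, function(s) {
  tbl_summary(
    data = wide,
    include = all_of(signal_cols),
    type = everything() ~ "continuous",
    statistic = everything() ~ paste0("{", s, "}"),
    missing = "no"
  ) %>%
    modify_header(all_stat_cols() ~ paste0("**", s, "**"))
})

# Merge into one table with one column per statistic
tbl <- tbl_merge(stat_tables) %>%
  modify_spanning_header(everything() ~ NA)

tbl
```

This version pools both conditions. Passing `by = COND` to each tbl_summary() call would split every statistic by condition, in which case the column headers would also need to identify the condition.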