r gtsummary

Formatting gtsummary table layout to show subcategories on first column


I am struggling to format a gtsummary table so that the results for each category (species) are broken down by region (both shown in the first column), with the seasons as columns.

# Load required libraries
library(dplyr)
library(tidyr)
library(gtsummary)
library(gt)  # tab_style(), cell_fill() and cells_column_labels() come from gt

# Create a sample data frame

mydf <- data.frame(
  species = c("Species1", "Species1", "Species1", "Species2", "Species2", "Species2", "Species3", "Species3"),
  season = c("Spring", "Spring", "Summer", "Spring", "Spring", "Summer", "Spring", "Summer"),
  region = c("Region1", "Region2", "Region1", "Region2", "Region1", "Region1", "Region1", "Region1")
)

# Create a summary table by season and, for each species, subdivide by region

myTbl <- mydf %>%
  tbl_summary(
    by = season,
    missing = "no",
    type = all_continuous() ~ "continuous2",
    statistic = list(all_continuous2() ~ c("{sum}", "{median} ({min} - {max})")),
    label = list(
      species ~ "Species",
      region ~ "Region",
      season ~ "Season"
    )
  ) %>%
  modify_table_body(
    ~ .x %>%
      mutate(
        across(all_stat_cols(), ~ gsub("^0.*", "-", .))
      )
  ) %>%
  bold_labels() %>%
  bold_levels() %>%
  modify_footnote(everything() ~ NA) %>%
  italicize_levels() %>%
  as_gt() %>%
  tab_style(
    style = list(cell_fill(color = "#DBE5F1")),
    locations = cells_column_labels()
  )

myTbl

This gives the following table

[Image: the table produced by the code above]

I am trying to make the table look more like this (mocked up in Excel; the numbers are made up, too)

[Image: mock-up of the desired table]

I can readily modify the top row (the column headings), but I have not been able to arrange the data the way I want (as in the second, mocked-up table).


Solution

  • I think gtsummary::tbl_strata() will get you the table you're after. Example below!

    library(gtsummary)
    library(dplyr)  # loaded explicitly for mutate() and across() used below
    packageVersion("gtsummary")
    #> [1] '1.7.2'
    
    mydf <- data.frame(
      species = c("Species1", "Species1", "Species1", "Species2", "Species2", "Species2", "Species3", "Species3"),
      season = c("Spring", "Spring", "Summer", "Spring", "Spring", "Summer", "Spring", "Summer"),
      region = c("Region1", "Region2", "Region1", "Region2", "Region1", "Region1", "Region1", "Region1") |> factor()
    )
    
    tbl <- 
      tbl_strata(
        data = mydf,
        strata = species,
        \(df_subset) {
          tbl_summary(
            data = df_subset,
            by = season,
            missing = "no",
            label = list(region = "Region", season = "Season")
          ) |> 
            modify_table_body(
              # Replace cells that start with "0" (e.g. "0 (0%)") with a dash
              ~ .x %>% mutate(across(all_stat_cols(), ~ gsub("^0.*", "-", .)))
            ) |> 
            remove_row_type(type = "header") |> 
            modify_header(all_stat_cols() ~ "**{level}**")
        },
        .combine_with = "tbl_stack"
      )
    

    [Image: the resulting table]

    Created on 2024-05-11 with reprex v2.1.0
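
As a sanity check on the numbers the stratified table should contain, the raw counts can be tabulated with base R alone. This is only a sketch for cross-checking, not part of the gtsummary approach itself; the names `counts` and `flat_counts` are introduced here purely for illustration.

```r
# Same sample data as in the question
mydf <- data.frame(
  species = c("Species1", "Species1", "Species1", "Species2", "Species2", "Species2", "Species3", "Species3"),
  season = c("Spring", "Spring", "Summer", "Spring", "Spring", "Summer", "Spring", "Summer"),
  region = c("Region1", "Region2", "Region1", "Region2", "Region1", "Region1", "Region1", "Region1")
)

# Count observations for every species x region x season combination
counts <- xtabs(~ species + region + season, data = mydf)

# Flatten to two dimensions: species and region as rows, season as columns
flat_counts <- ftable(counts, row.vars = c("species", "region"))
print(flat_counts)
```

The row layout (region nested within species, seasons across the top) mirrors the requested table, so each count can be compared against the matching cell. Combinations with a count of zero are the ones that the `gsub("^0.*", "-", .)` step turns into a dash.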