I am trying to represent multiple dichotomous variables as a subgroup of another variable.
Below a sample of my current code:
set.seed(123)
library(tidyverse)
library(gtsummary)
library(bstfun)
d <- tidyr::tibble(
id = c(1:20),
et_yn = structure(factor(sample(seq(0,1), size = 20, replace = TRUE), levels = c(0,1), labels = c("No", "Yes")), label = "ET")
) %>%
rowwise() %>%
mutate(
et_typ_1 = factor(ifelse(et_yn == "Yes", sample(seq(0,1), size = 1, replace = T), NA), levels = c(0,1), labels = c("No", "Yes")),
et_typ_2 = factor(ifelse(et_yn == "Yes", sample(seq(0,1), size = 1, replace = T), NA), levels = c(0:1), labels = c("No", "Yes")),
et_gp_1 = factor(ifelse(et_typ_2 == "Yes", sample(seq(0,1), size = 1, replace = T), NA), levels = c(0,1), labels = c("No", "Yes")),
et_gp_2 = factor(ifelse(et_typ_2 == "Yes", sample(seq(0,1), size = 1, replace = T), NA), levels = c(0,1), labels = c("No", "Yes")),
et_gp_3 = factor(ifelse(et_typ_2 == "Yes", sample(seq(0,1), size = 1, replace = T), NA), levels = c(0,1), labels = c("No", "Yes")),
) %>%
data.frame() %>%
mutate(
et_typ_2 = structure(et_typ_2, label = "ET gp"),
et_gp_1 = structure(et_gp_1, label = "St.a"),
et_gp_2 = structure(et_gp_2, label = "St.b"),
et_gp_3 = structure(et_gp_3, label = "St.c"),
)
d %>%
tbl_summary(
data = .,
include = -id,
missing = "no",
) %>%
add_n() %>%
modify_header(stat_0 = "**n (%)**") %>%
bstfun::add_variable_grouping(
"GP" = c("et_gp_1", "et_gp_2", "et_gp_3"),
) %>%
gtsummary::as_kable()
Characteristic | N | n (%) |
---|---|---|
ET | 20 | 9 (45%) |
et_typ_1 | 9 | 3 (33%) |
ET gp | 9 | 4 (44%) |
GP | ||
St.a |
4 | 2 (50%) |
St.b |
4 | 1 (25%) |
St.c |
4 | 1 (25%) |
Created on 2024-12-30 with reprex v2.1.1
However the output I would like is to have all St.<x>
without an N
and the GP
row with that same N
, like so:
Characteristic | N | n (%) |
---|---|---|
ET | 20 | 9 (45%) |
et_typ_1 | 9 | 3 (33%) |
ET gp | 9 | 4 (44%) |
GP | 4 | |
St.a |
2 (50%) | |
St.b |
1 (25%) | |
St.c |
1 (25%) |
I tried appending the following code at the bottom of the previous code:
d %>%
tbl_summary(
data = .,
include = -id,
missing = "no",
) %>%
add_n() %>%
modify_header(stat_0 = "**n (%)**") %>%
bstfun::add_variable_grouping(
"GP" = c("et_gp_1", "et_gp_2", "et_gp_3"),
) %>%
modify_table_body(
fun = ~ .x %>%
mutate(n = ifelse(variable == "GP", max(n), NA))
)
which doesn't unfortunately result in the desired output and actually seems to have no effect.
What am I doing wrong?
The bstfun::add_variable_grouping()
function is also a piece of code I would like to replace.
Thank you.
Here's an example of getting the table you're after.
library(gtsummary)
tbl <- trial |>
tbl_summary(
include = c("age", "response", "death"),
type = c("response", "death") ~ "dichotomous",
missing = "no"
) |>
add_n()
# print the table_body to see what is available to manipulate
tbl$table_body
#> # A tibble: 3 × 7
#> variable var_type row_type var_label label n stat_0
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 age continuous label Age Age 189 47 (38, 57)
#> 2 response dichotomous label Tumor Response Tumor Response 193 61 (32%)
#> 3 death dichotomous label Patient Died Patient Died 200 112 (56%)
# make final modificatons to the table
tbl |>
# begin by indenting the sub variables
modify_column_indent(
columns = "label",
rows = .data$variable %in% c("response", "death"),
indent = 4
) |>
# next add a row for the grouping
modify_table_body(
~ .x |>
tibble::add_row(label = "Treatment Outcome", .before = 2L)
) |>
# finally, remove Ns for the subgroups, and add an N to the grouping level
modify_table_body(
~ .x |>
dplyr::mutate(
n = dplyr::case_when(
# put any number you want here, it's not always the case that the subgroups
# will have the same number of non-missing observations (as in this example)
label == "Treatment Outcome" ~ stringr::str_glue("{nrow(trial)}"),
variable %in% c("response", "death") ~ NA,
.default = .data$n
)
)
) |>
as_kable() # convert to kable to display on stackoverflow
Characteristic | N | N = 200 |
---|---|---|
Age | 189 | 47 (38, 57) |
Treatment Outcome | 200 | |
Tumor Response | 61 (32%) | |
Patient Died | 112 (56%) |
Created on 2024-12-30 with reprex v2.1.1