I am learning how to analyse data and present the results of an RCT using R. I have read the package documentation and searched online, but did not find a solution. I have two groups of participants and want to show, in one table and for each outcome, the baseline data for both groups, the change within each group (from baseline to endpoint), and the difference between the groups at endpoint. I have attached an example table below.
I have simulated a data frame and written some code; the issues are discussed below:
ID <- 1:50
data <- data.frame(ID)
data$drug <- rbinom(n = 50, 1, prob = 0.5)
data$drug <- factor(data$drug, levels = c(0, 1),
                    labels = c("Drug X", "Drug Y"))
data$wt_0 <- rnorm(n = 50, mean = 70, sd = 5)
data$wt_12 <- rnorm(50, 68, 4.9)
head(data)
library(gtsummary)
library(gt)
subset(data, select = -ID) %>%
  tbl_summary(by = drug) %>%
  add_p()
I tried adding the change-in-weight column manually:
data_new <- data
# Add the change score to the copy, so that `data` itself stays unchanged
data_new$wt_change <- data_new$wt_0 - data_new$wt_12
subset(data_new, select = -ID) %>%
  tbl_summary(by = drug) %>%
  add_p()
I want a table like the one shown at first, in which each row corresponds to exactly one outcome. Is this feasible using the gtsummary package or any other package in R? It would be great if someone could help, because this may be a common scenario.
Note: I am aware of the multiplicity issue. We will state that all tests other than the primary test are exploratory and should not be interpreted as confirmatory.
To get just one row for each variable (weight, BMI, etc.), it may be necessary to use a reshaped data frame:
library(dplyr)  # select() and mutate() are used in the steps below

df <- data %>%
  tidyr::pivot_longer(starts_with("wt"),
                      names_to = "week", values_to = "weight", names_prefix = "wt_")
df
# A tibble: 100 x 4
ID drug week weight
<int> <fct> <chr> <dbl>
1 1 Drug X 0 66.3
2 1 Drug X 12 70.2
3 2 Drug X 0 72.3
4 2 Drug X 12 69.6
5 3 Drug X 0 78.2
Then you can use tbl_summary() with "by=week" inside a tbl_strata() call, stratifying on drug, and then add add_difference() to obtain your "Mean change" column for each drug.
tbl_1 <- df |>
  select(-ID) |>
  tbl_strata(
    strata = drug,
    .tbl_fun = ~ tbl_summary(.x, by = week,
                             label = list(weight ~ "Weight (kg)"),
                             digits = list(everything() ~ 2),
                             statistic = list(all_continuous() ~ "{mean} ({sd})")) |>
      add_difference(estimate_fun = weight ~ function(x) style_number(x, digits = 2)),
    .header = "**{strata}**"
  ) |>
  modify_header(all_stat_cols() ~ "**{level} weeks**",
                estimate_1 ~ "**Mean change**",
                estimate_2 ~ "**Mean change**")
tbl_1
Unfortunately, add_difference() calculates group 1 - group 2, when you probably want group 2 - group 1.
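As a cross-check on the sign, the within-drug change can also be computed outside gtsummary. The sketch below uses base R only and assumes the simulated data from the question; `chk` and `change_tbl` are names introduced here for illustration, and the placement of `set.seed(123)` is an assumption, so the numbers need not match the tables in this answer. It uses a paired t-test, since both measurements come from the same participant, and orders the arguments so that the estimate is week 12 minus week 0.

```r
# Rebuild the simulated data under a separate name (seed placement is an assumption)
set.seed(123)
chk <- data.frame(ID = 1:50)
chk$drug <- factor(rbinom(n = 50, 1, prob = 0.5), levels = c(0, 1),
                   labels = c("Drug X", "Drug Y"))
chk$wt_0 <- rnorm(n = 50, mean = 70, sd = 5)
chk$wt_12 <- rnorm(50, 68, 4.9)

# Paired t-test within each drug; the differences are wt_12 - wt_0
paired_change <- lapply(split(chk, chk$drug), function(g) {
  tt <- t.test(g$wt_12, g$wt_0, paired = TRUE)
  c(mean_change = unname(tt$estimate),
    lower = tt$conf.int[1],
    upper = tt$conf.int[2])
})

# One row per drug: mean change (endpoint - baseline) with its 95% CI
change_tbl <- do.call(rbind, paired_change)
change_tbl
```

Swapping the two arguments of `t.test()` flips the sign of the estimate and of the interval, which is a quick way to confirm which convention a table is using.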
To get the "T-test" column that compares the changes over time between the two drugs, again you can use add_difference().
tbl_2 <- mutate(data, weight = wt_0 - wt_12) |>
  select(drug, weight) |>
  tbl_summary(by = drug,
              label = list(weight ~ "Weight (kg)"),
              digits = list(everything() ~ 2)) |>
  add_difference(estimate_fun = weight ~ function(x) style_number(x, digits = 2)) |>
  modify_column_hide(c(stat_1, stat_2))
tbl_2
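For comparison, the same between-drug contrast can be sketched in base R. This assumes that add_difference() is using a Welch two-sample t-test for continuous variables (its default, as far as I know); `chk`, `wt_change` and `tt` are names introduced here for illustration, and the placement of `set.seed(123)` is again an assumption.

```r
# Rebuild the simulated data under a separate name (seed placement is an assumption)
set.seed(123)
chk <- data.frame(ID = 1:50)
chk$drug <- factor(rbinom(n = 50, 1, prob = 0.5), levels = c(0, 1),
                   labels = c("Drug X", "Drug Y"))
chk$wt_0 <- rnorm(n = 50, mean = 70, sd = 5)
chk$wt_12 <- rnorm(50, 68, 4.9)

# Change score with the same sign convention as tbl_2 (baseline - endpoint)
chk$wt_change <- chk$wt_0 - chk$wt_12

# Welch two-sample t-test of the change score between the two drugs
tt <- t.test(wt_change ~ drug, data = chk)

# Difference in mean change, Drug X - Drug Y, with 95% CI and p-value
diff_x_minus_y <- unname(tt$estimate[1] - tt$estimate[2])
ci <- as.numeric(tt$conf.int)
p_value <- tt$p.value
c(difference = diff_x_minus_y, lower = ci[1], upper = ci[2], p = p_value)
```

If the estimate, interval and p-value agree with the merged table up to rounding, that confirms both the test being used and the direction of the difference.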
And because we ensured that the names and labels of the two calculated variables were the same, we can use tbl_merge() to join these two gtsummary objects together:
tbl_merge(list(tbl_1, tbl_2)) |>
  modify_spanning_header(ends_with("1_1") ~ "**Drug X**",
                         ends_with("2_1") ~ "**Drug Y**",
                         ends_with("_2") ~ "**T-Test**")
Data:
set.seed(123) # data created by OP.