My apologies if this question is a duplicate or asked somewhere already.
I would like to create summary tables in the following format:
1) Discrete variables : n/N (%)
2a) Continuous variables : mean (SD); N
2b) Continuous variables : median (IQR); N
For example, if this is my data:
# Example dataset
set.seed(123)
data <- data.frame(
ChildSex = sample(c("Male", "Female"), 5006, replace = TRUE),
col1 = rnorm(5006, mean = 300, sd = 100),
col2 = rnorm(5006, mean = 400, sd = 150),
col3 = rnorm(5006, mean = 470, sd = 200)
)
The expected summary should look like this:
Discrete Variables
Child sex
Male 2505/5006 (50%)
Female 2501/5006 (50%)
Data missing 0/5006 (0%)
Continuous Variables: mean (SD); N
Col1 299.90 (99.38); 5006
Col2 399.12 (151.53); 5006
Continuous Variables: median (IQR); N
Col3 465.85 (268.15); 5006
I have around 20 discrete variables and 30 continuous variables (18 to be summarised as mean (SD) and 12 as median (IQR)). I would like to create a summary table as shown above without having to enter the variable names or levels manually. Thanks in advance for any suggestions or advice.
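As a cross-check on the three formats requested above, here is a minimal base R sketch that builds the same strings without naming the discrete variables or their levels. The helpers `fmt_discrete`, `fmt_mean_sd` and `fmt_median_iqr` and the vector `median_vars` are hypothetical names made up for this illustration, not part of any package. It also assumes that the denominator for discrete variables is the total number of rows, and that the only thing listed by hand is which continuous variables get the median (IQR) treatment.

```r
# Sketch only: base R, no extra packages. All helper names are hypothetical.
set.seed(123)
data <- data.frame(
  ChildSex = c(sample(c("Male", "Female"), 5005, replace = TRUE), NA),
  col1 = rnorm(5006, mean = 300, sd = 100),
  col2 = rnorm(5006, mean = 400, sd = 150),
  col3 = rnorm(5006, mean = 470, sd = 200)
)

# Discrete: "n/N (%)" for every level, plus a "Data missing" entry.
# Assumption: N is the total number of rows, missing values included.
fmt_discrete <- function(x) {
  N <- length(x)
  counts <- table(x, useNA = "always")
  n <- as.integer(counts)
  lv <- names(counts)
  lv[is.na(lv)] <- "Data missing"
  setNames(sprintf("%d/%d (%.0f%%)", n, N, 100 * n / N), lv)
}

# Continuous: "mean (SD); N", where N counts non-missing values.
fmt_mean_sd <- function(x) {
  sprintf("%.2f (%.2f); %d",
          mean(x, na.rm = TRUE), sd(x, na.rm = TRUE), sum(!is.na(x)))
}

# Continuous: "median (IQR); N", with the IQR reported as a single width.
fmt_median_iqr <- function(x) {
  sprintf("%.2f (%.2f); %d",
          median(x, na.rm = TRUE), IQR(x, na.rm = TRUE), sum(!is.na(x)))
}

# Assumption: the variables to summarise by median are listed once here.
median_vars <- "col3"

# Everything else is picked up from the column types.
is_num    <- vapply(data, is.numeric, logical(1))
mean_vars <- setdiff(names(data)[is_num], median_vars)

discrete <- lapply(data[!is_num], fmt_discrete)
means    <- vapply(data[mean_vars], fmt_mean_sd, character(1))
medians  <- vapply(data[median_vars], fmt_median_iqr, character(1))

print(discrete)
print(means)
print(medians)
```

With 50 variables the only manual step in this sketch is filling in `median_vars`; the discrete variables and their levels are discovered from the data.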
library(gtsummary)

set.seed(123)
# Same data as in the question, but with one missing ChildSex value
data <- data.frame(
  ChildSex = c(sample(c("Male", "Female"), 5005, replace = TRUE), NA),
  col1 = rnorm(5006, mean = 300, sd = 100),
  col2 = rnorm(5006, mean = 400, sd = 150),
  col3 = rnorm(5006, mean = 470, sd = 200)
)
head(data)
tbl_summary(data,
  type = list(all_continuous() ~ "continuous2"),
  statistic = list(c(col1, col2) ~ "{mean} ({sd}); {N_nonmiss}",
                   col3 ~ "{median} ({p25}-{p75}); {N_nonmiss}",
                   all_categorical() ~ "{n}/{N_nonmiss} ({p}%)"),
  # One digits value per statistic, in the order used in `statistic`
  digits = list(c(col1, col2) ~ c(2, 2, 0),
                col3 ~ c(2, 2, 2, 0)),
  missing = "ifany",
  missing_text = "Data missing",
  missing_stat = "{N_miss} / {N_obs} ({p_miss}%)")
Gives