I was wondering if anyone proficient in R/R Markdown could help me with an issue. I want to generate a frequency table, and so far I have been using tableby() from the arsenal package, as it is easy and convenient to integrate into an R Markdown docx/html document. However, I have been asked to provide rounded frequencies (to the nearest 5 or 10) and have not found a way to do it.
I have generated a simple fake dataset, as I cannot share my data for confidentiality reasons, and this is how I would produce a normal table.
set.seed(1234)
library(dplyr)
library(arsenal)
x1 <- c(rep("Man",40),rep("Woman",60)) %>% as.factor()
x2 <- sample(c("Sick","Healthy"),100,replace=TRUE) %>% as.factor()
df <- data.frame(x1,x2)
Control_notrounded <- tableby.control(digits=0,digits.pct=2,cat.stats=c("countpct","Nmiss2"))
table <- tableby(x1~x2,control=Control_notrounded,data=df)
print(summary(table))
Rounding to the nearest 10 with a traditional rounding function is done by passing digits = -1, but that approach does not work with tableby.control(): I get a warning indicating that digits must be >= 0.
Control_rounded <- tableby.control(digits=-1,digits.pct=2,cat.stats=c("countpct","Nmiss2"))
table2 <- tableby(x1~x2,control=Control_rounded,data=df)
print(summary(table2))
Is there any way to do that? Otherwise, can anyone suggest an alternative package that makes it relatively straightforward to create frequency tables with rounded values?
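For reference, the rounding itself is straightforward in base R; the difficulty is only in getting it into the formatted table. A minimal sketch, assuming a hypothetical helper `round_to()` (not part of arsenal or any other package), applied to a plain cross-tabulation of the fake data:

```r
# Hypothetical helper: round x to the nearest multiple of `base`.
# Note that R's round() sends exact halves to the even value
# (round(2.5) is 2), so exact ties may round down.
round_to <- function(x, base = 5) {
  round(x / base) * base
}

example_counts <- c(3, 12, 13, 38, 62)
nearest_5  <- round_to(example_counts, 5)   # 5 10 15 40 60
nearest_10 <- round_to(example_counts, 10)  # 0 10 10 40 60

# Same fake data as in the question, cross-tabulated and rounded.
set.seed(1234)
x1 <- factor(c(rep("Man", 40), rep("Woman", 60)))
x2 <- factor(sample(c("Sick", "Healthy"), 100, replace = TRUE))
rounded_table <- round_to(table(x2, x1), 5)
print(rounded_table)
```

This only produces the rounded counts; it does not give the formatted output that tableby() provides.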
I can recommend using the gtsummary package for creating baseline tables instead, then trying the round_5_gtsummary() function from this little GitHub package:
set.seed(1234)
library(dplyr)
library(gtsummary)
library(stringr)
x1 <- c(rep("Man",40),rep("Woman",60)) %>% as.factor()
x2 <- sample(c("Sick","Healthy"),100,replace=TRUE) %>% as.factor()
df <- data.frame(x1,x2)
install.packages("devtools")
devtools::install_github("zheer-kejlberg/Z.gtsummary.addons")
library(Z.gtsummary.addons)
df %>%
  tbl_summary(by = "x1") %>%
  add_overall(last = TRUE) %>%
  round_5_gtsummary() %>%
  add_p()
WEIGHTED VERSION
# Create IPT weights
library(WeightIt)
df$w <- weightit(x1~x2, data = df, estimand = "ATT", focal = "Man")$weights
Use survey to create a svydesign object, then apply tbl_svysummary() to that:
library(survey)
df %>%
  survey::svydesign(~1, data = ., weights = ~w) %>%
  tbl_svysummary(by = "x1", include = c(x2)) %>%
  add_overall(last = TRUE) %>%
  round_5_gtsummary() %>%
  add_p()
ALTERNATIVE WAY:
To use the built-in tbl_summary(digits=) argument to separately round the counts and percentages, you can do:
library(gtsummary)
library(dplyr)
set.seed(1234)
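# Round counts to the nearest 5. Values below 1 are treated as proportions
# and converted to percentages rounded to the nearest 5.
# Caveat: an input of exactly 1 is handled as a count and returns 0,
# so a proportion of 100% would not be shown as 100.
round_5 <- function(vec) {
  fun <- function(x) {
    if (x < 1) {
      round(x * 100 / 5) * 5
    } else {
      round(x / 5) * 5
    }
  }
  purrr::map_vec(vec, .f = fun)
}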
df <- data.frame(
  x1 = c(rep("Man", 40), rep("Woman", 60)) %>% as.factor(),
  x2 = sample(c("Sick", "Healthy"), 100, replace = TRUE) %>% as.factor()
)
df %>%
  tbl_summary(
    by = "x1",
    digits = all_categorical() ~ round_5
  ) %>%
  add_overall(last = TRUE) %>%
  add_p()
Note that this version does not recalculate the percentages from the rounded counts; it simply rounds the counts and the percentages separately.
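If the percentages should instead be consistent with the rounded counts, they can be recomputed after rounding. The following is only a base R sketch of that idea on the same fake data, outside gtsummary; the object names are illustrative, and it does not reproduce the formatted table:

```r
set.seed(1234)
x1 <- factor(c(rep("Man", 40), rep("Woman", 60)))
x2 <- factor(sample(c("Sick", "Healthy"), 100, replace = TRUE))

# Raw counts, then counts rounded to the nearest 5.
counts <- table(x2, x1)
rounded_counts <- round(counts / 5) * 5

# Column percentages recomputed from the rounded counts,
# so each column adds up to 100%.
pct <- 100 * prop.table(rounded_counts, margin = 2)

# Combine into "n (p%)" cells for display.
cells <- matrix(
  sprintf("%d (%.1f%%)", as.integer(rounded_counts), as.numeric(pct)),
  nrow = nrow(counts),
  dimnames = dimnames(counts)
)
print(cells)
```

Because the denominators here are the rounded column totals, these percentages can differ slightly from those obtained by rounding the original percentages.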