rggplot2survival-analysis

How can I add a risk table with cumulative events and number at risk using ggplot2


I have been trying to plot a cumulative incidence function (CIF) for death and epilepsy events in R. My code is like:

library(survminer)
library(cmprsk)
library(ggplot2)
library(purrr)

set.seed(123)

# Sample size
n <- 1000

# Create test dataset
bd_cs <- data.frame(
  id = 1:n,
  exposure = factor(
    sample(c("unexposed", "ZIKV_EXP", "CZS"), n, replace = TRUE),
    levels = c("unexposed", "ZIKV_EXP", "CZS")
  ),
  futime_month = round(runif(n, 0, 48), 2),  # follow-up times between 0 and 48 months
  outcome = factor(
    sample(c(0, 1, 2), n, replace = TRUE),
    levels = c(0, 1, 2)
  )
)

# Check it
head(test_data)

cif_model <- cmprsk::cuminc(ftime = bd_cs$futime_month,
                            fstatus = bd_cs$outcome,
                            group = bd_cs$exposure,
                            cencode = "0")

cif_plot <- survminer::ggcompetingrisks(
  fit = cif_model,
  multiple_panels = FALSE,
  xlab = "\n Age (months)",
  ylab = "Cumulative incidence of event \n",
  title = ""
)

cif_plot$mapping <- aes(x = time, y = est, color = group, linetype = event)

cif_plot <- cif_plot +
  labs(linetype = "Outcome", color = "Exposure") +
  geom_line(linewidth = 1) +
  scale_color_manual(
    labels = c("CZS", "Unexposed", "ZIKV exposed"),
    values = c("orange", "magenta", "blue")
  ) +
  scale_linetype_manual(
    values = c("solid", "dotted"),
    labels = c("Epilepsy", "Death")
  ) +
  scale_y_continuous(
    limits = c(0, 0.6),
    breaks = seq(0, 0.6,0.05)
  ) +
  scale_x_continuous(
    limits = c(0, 48),
    breaks = seq(0, 48, 6)
  ) +
  theme_bw()

#Creating table risk manually


time_point <- seq(0, 48, by = 6)

risk_tbl <- bd_cs %>%
  group_split(exposure) %>%
  map_dfr(function(group_df) {
    tibble(
      exposure = unique(group_df$exposure),
      !!!set_names(
        map(time_point, function(tp) sum(group_df$futime_month >= tp)),
        paste0("number.at.risk_", time_point)
      ),
      !!!set_names(
        map(time_point, function(tp) sum(group_df$outcome == "2" & group_df$futime_month <= tp)),
        paste0("death_", time_point)
      ),
      !!!set_names(
        map(time_point, function(tp) sum(group_df$outcome == "1" & group_df$futime_month <= tp)),
        paste0("epilepsy_", time_point)
      )
    )
  })

risk_tbl_long <- risk_tbl %>%
  pivot_longer(
    cols = c(-exposure),
    names_to = c("data", "time"),
    names_sep = "_",
    values_to = "values"
  )

The issue here is that my dataset is too large for using survival or other packages that natively provide risk tables. How can I add the risk table (risk_tbl_long) below my cif_plot?

I would like something like this:

Desired result

I have tryied multiple options but none so far has been effective


Solution

  • TBMK all out-of-the-box options which build on ggplot2 create the risk tables as a separate ggplot which is then combined with the chart of the survival curves using patchwork. But of course you can do this step manually, i.e., create your table manually using ggplot2, then combine using patchwork:

    library(patchwork)
    library(ggplot2)
    
    p_table <- ggplot(risk_tbl_long, aes(as.numeric(time), exposure)) +
      geom_text(aes(label = values), size = 8 / .pt) +
      facet_wrap(~data, ncol = 1) +
      scale_x_continuous(breaks = NULL) +
      labs(x = NULL, y = NULL) +
      theme_minimal() +
      theme(
        panel.grid = element_blank(),
        strip.text.x = element_text(face = "bold", hjust = 0)
      )
    
    cif_plot / p_table
    

    cumulative incidence plot + risk table