rsurvival-analysischi-squared

How to perform Gray's test with R package tidycmprsk for competing risk data


I have survival data with competing risk (censored 0, relaps 1, death 2) and I want to use Grey's test to check the difference between groups of a variable rh (categories 0, 1 and 2). This is my data:

> surv_data
# A tibble: 344 × 4
           ID        time_to_event          status    rh
          <dbl>             <dbl>            <dbl> <fct> 
 1            1             61.2                 0 PHLF 0
 2            2              6.53                1 PHLF 0
 3           10            109.                  0 PHLF 0
 4           15             15.5                 2 PHLF 2
 5           19              4.97                0 PHLF 0
 6           20             17.9                 1 PHLF 1
 7           22             61.7                 1 PHLF 0
 8           23             74.0                 2 PHLF 0
 9           26             67.9                 2 PHLF 1
10           27             46.8                 2 PHLF 0
# ℹ 389 more rows
# ℹ Use `print(n = ...)` to see more rows

What is the difference between Gray's test in tidycmprsk::tbl_cuminc and tidycmprsk::cuminc. I get different outputs.

I tried the following below, and I don't understand why I get different outputs of the tests. Hpw should i interpret the results and what is the right way to do Gray's test?

1)

tidycmprsk::cuminc(Surv(time_to_event, status) ~ rh, data = surv_data) %>% 
  tidycmprsk::tbl_cuminc(
    times =  c(6, 12, 24), label_header = "**Month {time}**"
  ) %>% 
  add_p() %>% 
  add_n()

and I get output:

`
Characteristic  N   Month 6       Month 12          Month 24    p-value1
rh  399                                               0.07
    rh 0        32% (15%, 33%)  45% (39%, 50%)  62% (36%, 65%)  
    rh 1        23% (9%, 28%)   37% (21%, 53%)  46% (38%, 68%)  
    rh  2       28% (12%, 42%)  27% (15%, 63%)  38% (21%, 55%)  
1 Gray’s Test
`
tidycmprsk::cuminc(ftime=surv_data$time_to_event, fstatus=surv_data$status, group=surv_data$rh, cencode=0)

and I get output:

`
Tests:
                      stat           pv df
Censored          2.20168  1.176649e-01  2
Relapse           5.75114  2.376312e-01  2
Death            28.383626 3.183201e-09  2
`
tidycmprsk::cuminc(Surv(time_to_event, status) ~ rh, data = surv_data, cencode=0)

and I get output:

`
• Tests
outcome            statistic   df     p.value    
Relapse            4.31        2.00   0.07      
Death              32.3        2.00   <0.001   
`

Solution

  • You have got yourself into difficulty because you are mixing tidycmprsk and cmprsk. Both of these packages have a function called cuminc(). Your second code snippet is:

    tidycmprsk::cuminc(ftime=surv_data$time_to_event, fstatus=surv_data$status, group=surv_data$rh, cencode=0)
    

    However, the docs for tidycmprsk::cuminc() indicate it either takes a formula or a parameter called x. The parameters you are providing (ftime, fstatus and so on) are the arguments for cmprsk::cuminc(). As all your parameters are named and none of them is x, this code will generate an error.

    The tidycmprsk::cuminc() source calls cmprsk::cuminc() under the hood, so once you resolve your argument names you will get the same results. Here's an example with the trial data built-in to the tidycmprsk package, using both approaches.

    tidycmprsk

    Note that here I am using death_cr as the status variable. As the docs state:

    The status variable must be a factor, where the first level indicates the observation was censored, and subsequent levels are the competing events.

    Make sure that your data is in this form.

    library(tidycmprsk)
    
    tidy_model <- tidycmprsk::cuminc(Surv(ttdeath, death_cr) ~ trt, trial) |>
        tbl_cuminc(times = c(6, 12, 24)) |>
        add_p() |>
        add_n()
    
    data.frame(tidy_model$table_body[1, c("statistic", "p.value")])
    #   statistic   p.value
    # 1   1.98847 0.1585009
    

    cmprsk

    This is the equivalent using cmprsk. Again as we are using the factor variable death_cr as the status, we can provide cencode = "censor", i.e. a character vector indicating the label of the factor level associated with being censored (rather than cencode = 0 in your question).

    
    gray_model <- cmprsk::cuminc(
        ftime = trial$ttdeath,
        fstatus = trial$death_cr,
        group = trial$trt,
        cencode = "censor"
    )
    
    gray_model$Tests
    #                          stat        pv df
    # death from cancer  1.98847032 0.1585009  1
    # death other causes 0.08856723 0.7660066  1
    

    You can see that the p-value and statistic are the same.

    A general comment on namespaces

    Where packages have overlapping functions (these two also have crr() in common), there are various approaches. In my example I loaded tidycmprsk with library(), and called cmprsk functions explicitly with ::. This is about the minimum you should do in these cases. The Google R Style Guide in fact suggests that:

    Users should explicitly qualify namespaces for all external functions.

    Of course this advice makes more sense for a huge company with a massive codebase than a single researcher who may not generally face a huge risk of namespace collisions.

    Another approach which has many advantages is to use the box package. This allows you to (amongst other things) explicitly import the functions you want to use from specific packages rather than importing all functions from a package, which library() does.