r survival-analysis rms

survplot() in "rms": number at risk is shown for the whole group, not for each curve


I am plotting survival curves with survplot() from the rms package. However, when I set n.risk = TRUE to add the number-at-risk table, R shows the count for the whole data set rather than for each curve, and I cannot figure out why.

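# Load rms first; Surv() is available once it is attached
library(rms)

# Survival object: follow-up time in months and the event indicator
survive <- Surv(dat$dx_lastcontact_death_months, dat$event)

# Cox model with radiation entered as an ordinary covariate
ff <- cph(survive ~ radiation, data = dat, x = TRUE, y = TRUE)

survplot(ff, radiation, conf.int = 0.95,
         lty = c(1, 1, 1), col = c("red", "blue", "yellow"), xlab = "", ylab = "",
         xlim = c(0, 60), time.inc = 12, label.curves = FALSE, n.risk = TRUE)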

[Image: survplot output with the number at risk shown incorrectly]

For example, at time = 0 the numbers at risk should be 7442, 3210, and 3042, but the plot shows 13694, the sum of the three groups. Could anyone help me figure out what is going wrong? Thanks!


Solution

  • @WeihuangWong is possibly correct. I get the same sort of output as you did when I run the first example in survplot, but wrapping the categorical variable in strat() produces the expected per-curve output. I also added surv = TRUE.

    library(rms)

    # Simulated survival data from the first example in survplot
    n <- 1000
    set.seed(731)
    age <- 50 + 12*rnorm(n)
    label(age) <- "Age"
    sex <- factor(sample(c('male','female'), n, TRUE))
    cens <- 15*runif(n)
    h <- .02*exp(.04*(age-50)+.8*(sex=='female'))
    dt <- -log(runif(n))/h
    label(dt) <- 'Follow-up Time'
    e <- ifelse(dt <= cens,1,0)
    dt <- pmin(dt, cens)
    units(dt) <- "Year"
    dd <- datadist(age, sex)
    options(datadist='dd')

    # Wrapping sex in strat() makes it a stratification factor
    f <- cph(Surv(dt,e) ~ strat(sex), x=TRUE, y=TRUE, surv=TRUE)
    survplot(f, sex, label.curves = FALSE, n.risk = TRUE)
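
If you want to confirm that the table under the plot is right, one option is to compute the per-group numbers at risk independently with survfit() from the survival package and compare. This is only a sketch: the data frame below is simulated, the column names are borrowed from the question, and the group labels ("none", "pre-op", "post-op") are made up for illustration.

```r
library(survival)

# Simulated stand-in for the asker's data (values are NOT real)
set.seed(1)
n <- 300
dat <- data.frame(
  radiation = factor(sample(c("none", "pre-op", "post-op"), n, replace = TRUE)),
  dx_lastcontact_death_months = rexp(n, rate = 1 / 40),
  event = rbinom(n, 1, 0.7)
)

# Kaplan-Meier fit with one curve per radiation group
km <- survfit(Surv(dx_lastcontact_death_months, event) ~ radiation, data = dat)

# Numbers at risk at the same times as the plot's tick marks (0, 12, ..., 60);
# extend = TRUE keeps a row for every group even past its last observation
s <- summary(km, times = seq(0, 60, by = 12), extend = TRUE)
risk <- data.frame(group = s$strata, time = s$time, n.risk = s$n.risk)
print(risk)
```

At time 0 each group's n.risk should equal that group's size (table(dat$radiation)), which is what the per-curve rows in survplot should show as well.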
    
