I was plotting survival curves using survplot{rms}. However, when I used n.risk = TRUE to add the number-at-risk table, R gave me the number for the whole data set rather than for each curve, and I cannot figure out why.
# initialize survival commands in R
library(rms)  # load rms first so that Surv() is available
survive <- Surv(dat$dx_lastcontact_death_months, dat$event)
ff <- cph(survive ~ radiation, data = dat, x = TRUE, y = TRUE)
survplot(ff, radiation, conf.int = 0.95,
         lty = c(1, 1, 1), col = c("red", "blue", "yellow"), xlab = "", ylab = "",
         xlim = c(0, 60), time.inc = 12, label.curves = FALSE, n.risk = TRUE)
For example, at time = 0 the numbers at risk should be 7442, 3210, and 3042, but the plot shows 13694, the sum of the three groups. Could anyone help me figure out what is going wrong? Thanks.
@WeihuangWong is possibly correct. I get the same sort of output as you did when I use the first example in survplot, but wrapping the categorical variable in strat() results in the expected output format. I also added surv=T.
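library(rms)  # provides cph(), survplot(), strat(), datadist(); Hmisc supplies label() and units()

# simulate follow-up times that depend on age and sex
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('male','female'), n, TRUE))
cens <- 15*runif(n)                              # censoring times
h <- .02*exp(.04*(age-50)+.8*(sex=='female'))    # hazard
dt <- -log(runif(n))/h                           # event times
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)                      # 1 = event observed, 0 = censored
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')

# strat(sex) stratifies the model by sex, so n.risk is reported per curve
f <- cph(Surv(dt,e) ~ strat(sex), x=TRUE, y=TRUE, surv=TRUE)
survplot(f, sex, label.curves = FALSE, n.risk = TRUE)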
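
To carry this over to a three-group factor like the radiation variable in the question, here is a minimal sketch on simulated data. The data frame, its column names (months, event), and the group labels are made up for illustration and are not the asker's data. It applies the same strat() and surv=TRUE changes, then cross-checks the per-group numbers at risk at time 0 against a Kaplan-Meier fit from the survival package.

```r
library(rms)
library(survival)  # loaded explicitly for survfit()

# Hypothetical stand-in for the question's data set
set.seed(1)
n <- 300
dat <- data.frame(
  months    = rexp(n, rate = 0.03),
  event     = rbinom(n, 1, 0.7),
  radiation = factor(sample(c("none", "pre", "post"), n, replace = TRUE))
)

dd <- datadist(dat)
options(datadist = "dd")

# strat() turns radiation into a stratification factor,
# so each curve is drawn with its own risk set
ff <- cph(Surv(months, event) ~ strat(radiation), data = dat,
          x = TRUE, y = TRUE, surv = TRUE)
survplot(ff, radiation, label.curves = FALSE, n.risk = TRUE)

# Cross-check: numbers at risk per group at time 0 from a Kaplan-Meier fit
km <- survfit(Surv(months, event) ~ radiation, data = dat)
at_risk <- summary(km, times = 0)
print(data.frame(group = at_risk$strata, n.risk = at_risk$n.risk))
```

If the numbers under the plot at time 0 match the group sizes printed by the last line, rather than their sum, the risk table is being computed per stratum.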