I plot cumulative incidence curves based on data from the survival::survfit()
function. Following this vignette I do something like the following:
library(survival)
mgus2$etime <- with(mgus2, ifelse(pstat==0, futime, ptime))
event <- with(mgus2, ifelse(pstat==0, 2*death, 1))
mgus2$event <- factor(event, 0:2, labels=c("censor", "pcm", "death"))
mfit2 <- survfit(Surv(etime, event) ~ sex, data=mgus2)
plot(mfit2)
The data has two possible events (pcm and death) and the included covariate sex has two levels (male and female). Thus, as expected, the resulting plot depicts the the probabilities for four lines (I left out the labels because they don't matter here).
But how to extract this data? I looked everywhere inside mfit2
but can not find the data needed to create the plot. There is mfit2$time
for the x axis, but mfit2$pstate
does not include all four probabilities (pcm female, pcm male, death female and death male). Where are those probabilities hidden in mfit2
?
I found this answer which suggests some workaround in order to get the data. I see that the workaround works, that's not the question. Where is the data for the plot inside mfit2
? It must be somewhere since plot(mfit2)
returns as the plot we would expect.
You're after the $pstate
, which contains the concatenated probabilities for the two levels of sex
(females are the first level).
levels(mgus$sex)
#[1] "female" "male"
The length of mfit2$pstate
(and mfits$time
) is 454, the first 227 corresponds to probabilities for females (red).
mfit2$strata
sex=F sex=M
227 227
You can see that this is correct by plotting the fitted object and overlaying the lines based on the above information.
plot(mfit2)
lines(mfit2$time[1:227], mfit2$pstate[1:227, 2], type="s", col="red", lty=2)
lines(mfit2$time[228:454], mfit2$pstate[228:454, 2], type="s", col="green", lty=2)
lines(mfit2$time[1:227], mfit2$pstate[1:227, 3], type="s", col="red", lty=2)
lines(mfit2$time[228:454], mfit2$pstate[228:454, 3], type="s", col="green", lty=2)