This question is related to another that I posted here:
How to visually assess tt() suitability in coxph?
If we have a time-varying HR that arises from a time-dependent coefficient because we have specified a time-transform using tt(), how do we plot that?
Using the example from the survival vignette, if we specified a model incorporating a spline term on karno because of non-proportional hazards:
library(survival)
fit.tt3 <- coxph(Surv(time, status) ~ trt + prior + karno + tt(karno), data = veteran,
tt = function(x, t, ...) x * nsk(t, knots = c(5, 20, 50, 150, 300),
Boundary.knots = FALSE))
Is there a straightforward way (well, any way would be good) to plot the HR for karno as a function of time? It would be good to be able to communicate to other members of a research team, what the time-dependence means by showing them a simple figure?
Thanks
I think I finally have an answer for this. If I understand things correctly, the key to predicting and plotting a time-covariate interaction from a coxph()
model is to split time into periods and estimate time-dependent coefficients within each period - not by using the tt()
term directly.
In the code below I have shown an example with a linear time:covariate interaction and a spline time:covariate interaction and compared time-varying HR's from both the cox model and a flexible-parametric (Royston-Parmar) model.
Hopefully this might help anyone else not sure about how to do this.
library(survival)
library(tidyverse)
# This comes from p 28 of:
# https://cran.r-project.org/web/packages/survival/vignettes/timedep.pdf
# In that example it shows how to predict for a 30-unit change in karno (keeping all other covariates = 0 in the newdat df)
# A KEY DIFFERENCE IN PREDICTING AND PLOTTING HR'S FOR A TIME-VARYING COVARIATE EFFECT BETWEEN COXPH AND RSTPM2 IS THAT THE COX MODEL REQUIRES DATA TO BE SPLIT BY THE TIME PERIODS OF INTEREST INTO COUNTING PROCESS FORMAT. FOR THE RP MODEL, ONE CAN JUST USE THE ORIGINAL DATASET OF 1 ROW PER PERSON.
# In this example, let's say we want to look at the time-varying effect of trt
# When predicting from a cox model it seems easier to keep everything numeric, as we basically want to predict on a newdat df that is zero on all covariates except for the one-unit change of the covariate of interest
# Load veteran data and convert trt to numeric, with a reference cat = 0
vdata1 <- veteran
vdata1$trt <- as.numeric(vdata1$trt) - 1
# Fit basic model with treatment as only predictor
vfit1 <- coxph(Surv(time, status) ~ trt, vdata1)
summary(vfit1)
# Most obs times are < 13 months - create monthly time intervals and split data into counting process format based on that
dtime <- round(1:13 * 30.5)
vdata2 <- survSplit(Surv(time, status) ~ ., vdata1, cut = dtime, episode = "month")
#++++++++++++++++++++++++++++++
# Now fit a linear trt*month interaction
vfit2 <- coxph(Surv(tstart, time, status) ~ trt + trt:month, vdata2)
summary(vfit2)
# Create newdat df a one unit change in trt
tdata <- expand.grid(trt = 1, month = seq(1,13, length = 50))
# Predict lp and plot - this should be linear
yhat <- predict(vfit2, newdata = tdata, se.fit = TRUE, type = "lp", reference = "zero")
tdata$fit <- yhat$fit
ggplot(tdata, aes(x = month, y = fit)) +
geom_point() +
geom_line()
# Predict HR and plot
yhat <- predict(vfit2, newdata = tdata, se.fit = TRUE, type = "risk", reference = "zero")
tdata$fit <- yhat$fit
ggplot(tdata, aes(x = month, y = fit)) +
geom_line(col = 'blue') +
scale_x_continuous(limits = c(0, 15), breaks = seq(0, 15, by = 5)) +
scale_y_continuous(limits = c(0, 1.5), breaks = seq(0, 1.5, by = 0.25)) +
xlab("Observation Time") + ylab("HR") + ggtitle("Cox Model - Linear trt:time interaction") +
theme_bw(base_size = 15)
# How does this compare to a RP model with 1 df on trt (note that we don't use CP format for this - just the original dataset)
library(rstpm2)
vfit2_rp <- stpm2(Surv(time, status) ~ trt, data = vdata1, df = 3, tvc = list(trt = 1))
summary(vfit2_rp)
## Calculate standardised HR
n_time_points <- 100
pred_hrs <- predict(vfit2_rp, type = "meanhr", seqLength = n_time_points,
newdata = transform(vdata1, trt = 0),
exposed = function(data) transform(data, trt = 1),
grid = TRUE, full = TRUE, se.fit = TRUE)
# Now plot
ggplot(pred_hrs, aes(x = time/30, y = Estimate)) +
geom_line(col = 'red') +
scale_x_continuous(limits = c(0, 15), breaks = seq(0, 15, by = 5)) +
scale_y_continuous(limits = c(0, 1.5), breaks = seq(0, 1.5, by = 0.25)) +
xlab("Observation Time") + ylab("HR") + ggtitle("Cox Model - Linear trt:time interaction") +
theme_bw(base_size = 15)
#++++++++++++++++++++++++++++++
# Now fit a spline trt*month interaction
vfit3 <- coxph(Surv(tstart, time, status) ~ trt + trt:nsk(month, df = 3), vdata2)
summary(vfit3)
# Create newdat df a one unit change in trt
tdata <- expand.grid(trt = 1, month = seq(1,13, length = 50))
# Predict lp and plot
yhat <- predict(vfit3, newdata = tdata, se.fit = TRUE, type = "lp", reference = "zero")
tdata$fit <- yhat$fit
ggplot(tdata, aes(x = month, y = fit)) +
geom_point() +
geom_line()
# Predict HR and plot
yhat <- predict(vfit3, newdata = tdata, se.fit = TRUE, type = "risk", reference = "zero")
tdata$fit <- yhat$fit
ggplot(tdata, aes(x = month, y = fit)) +
geom_line(col = 'blue') +
scale_x_continuous(limits = c(0, 15), breaks = seq(0, 15, by = 5)) +
scale_y_continuous(limits = c(0, 2), breaks = seq(0, 2, by = 0.25)) +
xlab("Observation Time") + ylab("HR") + ggtitle("Cox Model - Spline trt:time interaction") +
theme_bw(base_size = 15)
# How does this compare to a RP model with 5 df on trt (note that we don't use CP format for this - just the original dataset)
library(rstpm2)
vfit3_rp <- stpm2(Surv(time, status) ~ trt, data = vdata1, df = 3, tvc = list(trt = 5))
summary(vfit3_rp)
## Calculate standardised HR
n_time_points <- 100
pred_hrs <- predict(vfit3_rp, type = "meanhr", seqLength = n_time_points,
newdata = transform(vdata1, trt = 0),
exposed = function(data) transform(data, trt = 1),
grid = TRUE, full = TRUE, se.fit = TRUE)
# Now plot
ggplot(pred_hrs, aes(x = time/30, y = Estimate)) +
geom_line(col = 'red') +
scale_x_continuous(limits = c(0, 15), breaks = seq(0, 15, by = 5)) +
scale_y_continuous(limits = c(0, 2), breaks = seq(0, 2, by = 0.25)) +
xlab("Observation Time") + ylab("HR") + ggtitle("Royston-Parmar Model - Spline trt:time interaction") +
theme_bw(base_size = 15)