I have a Cox regression model as follows:
library(dplyr)
library(survival)
library(eha)
df <- data.frame(id = c("a", "a", "a", "b", "b", "c", "c", "c"),
sex = c("0", "0", "0", "1", "1", "0", "0", "0"),
start = c(0, 1, 5, 0, 7, 0, 2, 10),
stop = c(1, 5, 10, 7, 18, 2, 10, 19),
status = c(1, 1, 0, 1, 0, 1, 1, 0))
id sex start stop status
1 a 0 0 1 1
2 a 0 1 5 1
3 a 0 5 10 0
4 b 1 0 7 1
5 b 1 7 18 0
6 c 0 0 2 1
7 c 0 2 10 1
8 c 0 10 19 0
# Sex violates proportional hazards
logrank(Surv(start, stop, status), group = sex, data = df)
# Regression
fit <- coxph(Surv(start, stop, status) ~ strata(sex) + cluster(id), data = df)
cluster(id)
represents the implementation of the Anderson-Gill model for recurrent events. I want to estimate the (non-cumulative) hazard function from this.
I know I can obtain the cumulative baseline hazard using:
# Cumulative hazard
basehaz(fit)
But how to obtain the non-cumulative hazard? Any help is appreciated.
The basehaz
function will return results for stratified models. Essentially it breaks the analysis into stratum-specific hazards:
df <- data.frame(id = c("a", "a", "a", "b", "b", "c", "c", "c"),
sex = c("0", "0", "0", "1", "1", "0", "0", "0"),
start = c(0, 1, 5, 0, 7, 0, 2, 10),
stop = c(1, 5, 10, 7, 18, 2, 10, 19),
status = c(1, 1, 0, 1, 0, 1, 1, 0))
fit <- coxph(Surv(start, stop, status) ~ strata(sex) + cluster(id), data = df)
basehaz(fit)
#------
hazard time strata
1 0.5 1 0
2 1.0 2 0
3 1.5 5 0
4 2.0 10 0
5 2.0 19 0
6 1.0 7 1
7 1.0 18 1
Contrast that result with the results from a "null" model.
fit <- coxph(Surv(start, stop, status) ~ 1, data = df)
basehaz(fit)
#------------
hazard time
1 0.3333333 1
2 0.6666667 2
3 1.0000000 5
4 1.3333333 7
5 1.6666667 10
6 1.6666667 18
7 1.6666667 19
The results are consistent with the usual explanation that stratified model deliver a weighted average of the results in two separated strata.