rstatasurvival-analysissurvivalstandard-error

How to get the exact same standard errors obtained in Stata when reproducing survival analysis in R?


I am reproducing in R some survival analysis results published in a journal. The original results were produced in Stata. Here are the original results:

Original Results

Here is the code to produce these results in R:

# load packages
library(dplyr)
library(foreign)
library(msm)
library(stargazer)

# load Svolik's original data 
data = read_stata("leaders, institutions, covariates, updated tvc.dta")

# set a t0 for each row
data = mutate(data,t0 = lag(t,default=0), .by=leadid)

# coup survival object original
survobj_coup = Surv(data[["t0"]], data[["_t"]], data$c_coup)

# coup model original
coups_original <- coxph(survobj_coup~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age, 
      data=data, ties="breslow")

# revolt survival object original 
survobj_revolt = Surv(data[["t0"]], data[["_t"]], data$c_revolt)

# revolt model original 
revolt_original <- coxph(survobj_revolt~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ mil+ cw+ age, 
                        data=data, ties="breslow")

# natural survival object original
survobj_natural = Surv(data[["t0"]], data[["_t"]], data$c_natural)

# natural model original
natural_original <- coxph(survobj_natural~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age, 
                        data=data, ties="breslow")

# Define a function to exponentiate coefficients
exp_coef <- function(x) {exp(x) }

# Create the table using stargazer
stargazer(natural_original, coups_original, revolt_original, apply.coef = exp_coef, p.auto = FALSE)

While I am able to produce the exact same coefficients (save for slight differences in rounding) with the exact same significance levels, the standard errors do not match. For example, in Model 1 in the figure (first column in Natural Causes), I obtain a standard error of 0.414 rather than 0.198 for the coefficient on Legislature (0.456*). I was reading that the differences may be due to how the standard errors are transformed (something to do with the delta method perhaps). Does anyone have any advice? Thanks.


Solution

  • Quick answer is they measure different things. In your example, Stata reports approximate standard errors of HRs, while R reports SE of coefficient se(coef). I used an example from Stata manual:

    library(survival)
    library(webuse)
    
    webuse("drugtr")
    m_1 <- coxph(Surv(studytime, died) ~ drug + age, drugtr)
    summary(m_1)
    

    SE for 'drug' is .0477017 in Stata (SE for HR), but 0.46052 in R (SE for coefficient).

    To get the same (or similar) SE values as Stata, you could borrow some functions from R packages. I used deltaMethod in car package.:

    library(car)
    deltaMethod(m_1, "exp(drug)")
    

    Now it is 0.044426, similar to the value using Stata. There may be other alternatives.