
Different outcomes using CMAverse with bootstrap inference and imputation estimation in R


Using CMAverse in R, I can't seem to get the same results when I run the same model twice. Example code:

# Load packages and set seed.
library(CMAverse)
library(tidyverse)
library(magrittr)
library(janitor)
set.seed(1)

# Simulate data containing a continuous baseline confounder C1, a binary baseline confounder C2, a binary exposure A, a binary mediator M and a binary outcome Y. 
expit <- function(x) exp(x)/(1+exp(x))
n <- 10000
C1 <- rnorm(n, mean = 1, sd = 0.1)
C2 <- rbinom(n, 1, 0.6)
A <- rbinom(n, 1, expit(0.2 + 0.5*C1 + 0.1*C2))
M <- rbinom(n, 1, expit(1 + 2*A + 1.5*C1 + 0.8*C2))
Y <- rbinom(n, 1, expit(-3 - 0.4*A - 1.2*M + 0.5*A*M + 0.3*C1 - 0.6*C2))
data <- data.frame(A, M, Y, C1, C2)

# Run causal mediation analysis.
model <- cmest(data = data, model = "rb", outcome = "Y", exposure = "A",
               mediator = "M", basec = c("C1", "C2"), EMint = TRUE,
               mreg = list("logistic"), yreg = "logistic",
               astar = 0, a = 1, mval = list(1), yval=1,
               estimation = "imputation", inference = "bootstrap", nboot = 10)

# Get the summary of the model.
summary <- model %>% summary() 
summary$summarydf %>%
  clean_names() %>%
  mutate_at(vars(estimate, x95_percent_cil, x95_percent_ciu),
            ~ format(round(., digits = 2), nsmall = 2, trim = TRUE)) %>%
  mutate(estimate = paste0(estimate, " (", x95_percent_cil, "-", x95_percent_ciu, ")")) %>%
  select(-x95_percent_cil, -x95_percent_ciu) %>%
  mutate(p_val = format(round(p_val, digits = 3), nsmall = 3))
# The pure natural direct effect is 1.42 (1.27-1.87) and the pure natural indirect effect is 0.92 (0.89-0.98).

# Rerun the causal mediation analysis.
model <- cmest(data = data, model = "rb", outcome = "Y", exposure = "A",
               mediator = "M", basec = c("C1", "C2"), EMint = TRUE,
               mreg = list("logistic"), yreg = "logistic",
               astar = 0, a = 1, mval = list(1), yval=1,
               estimation = "imputation", inference = "bootstrap", nboot = 10)

# Get the summary of the model.
summary <- model %>% summary() 
summary$summarydf %>%
  clean_names() %>%
  mutate_at(vars(estimate, x95_percent_cil, x95_percent_ciu),
            ~ format(round(., digits = 2), nsmall = 2, trim = TRUE)) %>%
  mutate(estimate = paste0(estimate, " (", x95_percent_cil, "-", x95_percent_ciu, ")")) %>%
  select(-x95_percent_cil, -x95_percent_ciu) %>%
  mutate(p_val = format(round(p_val, digits = 3), nsmall = 3))
# The pure natural direct effect is 1.43 (1.08-1.75) and the pure natural indirect effect is 0.91 (0.85-0.99).

How do I get R to give me the same results when rerunning the same model?


Solution

  • I haven't run your entire script, as CMAverse isn't available for the version of R I'm running, but I'm guessing the problem is that you only set the seed once, at the top of your script? With inference = "bootstrap", each call to cmest() draws new random numbers for the resampling, so the random number stream has moved on by the time you rerun the model. The same thing happens with any random function: if you want the same values each time you call rnorm(), you have to run set.seed() each time. For example:

    set.seed(1)
    rnorm(10)
    # [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684  0.4874291
    # [8]  0.7383247  0.5757814 -0.3053884
    rnorm(10)
    # [1]  1.51178117  0.38984324 -0.62124058 -2.21469989  1.12493092 -0.04493361 -0.01619026
    # [8]  0.94383621  0.82122120  0.59390132
    

    will produce vectors with different values, whereas:

    set.seed(1)
    rnorm(10)
    # [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684  0.4874291
    # [8]  0.7383247  0.5757814 -0.3053884
    set.seed(1)
    rnorm(10)
    # [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684  0.4874291
    # [8]  0.7383247  0.5757814 -0.3053884
    

    will repeat the same vector over and over. The same logic applies to your cmest() calls: re-seed immediately before each one (see the sketch below). If this doesn't solve your issue, let me know and I'll dig into it further.
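
    Here is roughly what that would look like (untested on my end, since I can't install CMAverse; the cmest() arguments are copied straight from your script, and I've stored the rerun as model_rerun just so the two fits can be compared):

    # Re-seed right before the first bootstrap run.
    set.seed(1)
    model <- cmest(data = data, model = "rb", outcome = "Y", exposure = "A",
                   mediator = "M", basec = c("C1", "C2"), EMint = TRUE,
                   mreg = list("logistic"), yreg = "logistic",
                   astar = 0, a = 1, mval = list(1), yval=1,
                   estimation = "imputation", inference = "bootstrap", nboot = 10)

    # Re-seed with the same value before the rerun so cmest() draws
    # identical bootstrap resamples the second time around.
    set.seed(1)
    model_rerun <- cmest(data = data, model = "rb", outcome = "Y", exposure = "A",
                         mediator = "M", basec = c("C1", "C2"), EMint = TRUE,
                         mreg = list("logistic"), yreg = "logistic",
                         astar = 0, a = 1, mval = list(1), yval=1,
                         estimation = "imputation", inference = "bootstrap", nboot = 10)

    With the seed reset before each call, both summaries should report identical point estimates and confidence intervals.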