r arima causality

ARIMAX exogenous variables reverse causality


I am trying to fit an ARIMAX model to find out whether containment measures (measured by the Government Response Stringency Index, a number from 0 to 100) have a significant effect on the daily new cases rate. I also want to add test rates. I programmed everything in R (every time series is stationary, ...) and ran the Granger causality test. Result: Pr(>F) is less than 0.05. Therefore the null hypothesis of no Granger causality can be rejected, and the new cases rate and the containment measures show reverse causality. Is there any way to transform the "stringency index" variable and continue with an ARIMAX model? If so, how can this be done in R?


Solution

  • In R, the "forecast" package can be used to build ARIMA models. Recall that there is a difference between true ARIMAX models and linear regressions with ARIMA errors. For more detail, see this post by Rob Hyndman (the forecast package author): The ARIMAX model muddle

    Here are Rob Hyndman's examples of fitting a linear regression with ARIMA errors; more information is available here:

    library(forecast)
    library(fpp2) # To get a data set to work on
    # Fit a linear regression with AR errors
    fit <- Arima(uschange[,"Consumption"], xreg = uschange[,"Income"], order = c(1,0,0))
    # Forecast 8 quarters ahead, assuming future Income equals its historical mean
    fcast <- forecast(fit, xreg = rep(mean(uschange[,"Income"]), 8))
    autoplot(fcast) + xlab("Year") +
      ylab("Percentage change")
    
    # Use auto.arima function to find the optimal parameters
    fit <- auto.arima(uschange[,"Consumption"], xreg = uschange[,"Income"])
    # Forecast 8 quarters ahead under the same assumption and plot the predictions
    fcast <- forecast(fit, xreg = rep(mean(uschange[,"Income"]), 8))
    autoplot(fcast) + xlab("Year") +
      ylab("Percentage change")
    

    Regarding your question about how to deal with the reverse causality, it is clear that you have endogeneity bias: the response stringency index affects the daily new cases rate and vice versa. If this is a prediction problem rather than an estimation problem, I wouldn't worry too much about it as long as the predictions are good. For estimation or causal inference, I would try to find different exogenous variables, or use instrumental or control variables.
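
To make the reverse-causality check concrete, here is a minimal sketch in base R of a Granger test run in both directions. The two series are simulated stand-ins, not the real stringency index or case data. `granger_p` is a hypothetical helper written for this illustration: it compares an autoregression of one series with and without lags of the other, using an F-test. With real data, `grangertest()` from the lmtest package is a ready-made alternative.

```r
# Simulated stand-ins for the real series (assumption: purely illustrative data).
set.seed(1)
n <- 300
stringency <- numeric(n)
cases <- numeric(n)
for (t in 2:n) {
  # Measures react to yesterday's cases, and cases react to yesterday's measures.
  stringency[t] <- 0.5 * stringency[t - 1] + 0.4 * cases[t - 1] + rnorm(1)
  cases[t]      <- 0.5 * cases[t - 1] - 0.4 * stringency[t - 1] + rnorm(1)
}

# Hypothetical helper (not from any package).
# Null hypothesis: lags of x do NOT help to predict y (no Granger causality).
granger_p <- function(y, x, p = 1) {
  # embed() on a two-column matrix yields y_t, x_t, y_{t-1}, x_{t-1}, ...
  e <- embed(cbind(y, x), p + 1)
  y_now  <- e[, 1]
  y_lags <- e[, 2 * seq_len(p) + 1, drop = FALSE]
  x_lags <- e[, 2 * seq_len(p) + 2, drop = FALSE]
  restricted   <- lm(y_now ~ y_lags)           # own lags only
  unrestricted <- lm(y_now ~ y_lags + x_lags)  # own lags plus lags of x
  anova(restricted, unrestricted)[2, "Pr(>F)"]
}

p_measures_to_cases <- granger_p(cases, stringency)
p_cases_to_measures <- granger_p(stringency, cases)

# A p-value below 0.05 rejects the null of no Granger causality.
# Rejecting it in both directions is the feedback (reverse causality) situation.
print(c(measures_to_cases = p_measures_to_cases,
        cases_to_measures = p_cases_to_measures))
```

If only one direction rejects, the evidence points to one-way predictive precedence. If both reject, as the simulated data are built to do, the coefficient on the stringency index in a regression with ARIMA errors should not be read as a causal effect.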