
Accounting for Spatial Autocorrelation in Model


I am trying to account for spatial autocorrelation in a model in R. Each observation is a country for which I have the average latitude and longitude. Here's some sample data:

country <- c("IQ", "MX", "IN", "PY")
long <- c(43.94511, -94.87018, 78.10349, -59.15377)
lat <- c(33.9415073, 18.2283975, 23.8462264, -23.3900255)
Pathogen <- c(10.937891, 13.326284, 12.472374, 12.541716)
Answer.values <- c(0, 0, 1, 0)

data <- data.frame(country, long, lat, Pathogen, Answer.values)

I know spatial autocorrelation is an issue (Moran's I is significant across the whole dataset). This is the model I am testing, Answer Values (a 0/1 variable) ~ Pathogen Prevalence (a continuous variable):

model <- glm(Answer.values ~ Pathogen,
             na.action = na.omit,
             data = data,
             family = "binomial")

How would I account for spatial autocorrelation with a data structure like this?


Solution

  • There are a lot of potential answers to this. One easy(ish) way is to use mgcv::gam() to add a spatial smoother. Most of your model would stay the same:

    library(mgcv)
    gam(Answer.values ~ Pathogen + s([something]),
        family = "binomial",
        data = data)
    

    where s([something]) is some form of smooth spatial term. Three possible/reasonable choices would be:

    A helpful link for getting up to speed with GAMs in R ...
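
As a rough illustration of what the smooth spatial term could look like, here is a minimal sketch using two smooths that mgcv provides: a default thin-plate smooth of the raw coordinates, s(long, lat), and a spline on the sphere, s(lat, long, bs = "sos"). These are not necessarily the choices the answer had in mind. The four-row sample above is far too small to fit a smoother, so the data frame `sim` and the signal used to generate it are simulated and purely hypothetical; only the column names follow the question.

```r
library(mgcv)

set.seed(1)
n <- 200

# Simulated stand-in for the real country-level data (hypothetical values)
sim <- data.frame(
  long     = runif(n, -180, 180),
  lat      = runif(n, -60, 70),
  Pathogen = rnorm(n, mean = 12, sd = 1)
)

# Hypothetical linear predictor with a latitude-driven spatial signal,
# so that the smoother has some spatial structure to pick up
eta <- -0.5 + 0.4 * (sim$Pathogen - 12) + sin(sim$lat * pi / 180)
sim$Answer.values <- rbinom(n, 1, plogis(eta))

# Option A: isotropic thin-plate smooth of the raw coordinates.
# This treats degrees as planar units, which distorts distances at a global scale.
fit_tp <- gam(Answer.values ~ Pathogen + s(long, lat),
              family = binomial, data = sim, method = "REML")

# Option B: spline on the sphere. mgcv expects latitude first,
# then longitude, both in degrees.
fit_sos <- gam(Answer.values ~ Pathogen + s(lat, long, bs = "sos"),
               family = binomial, data = sim, method = "REML")

summary(fit_sos)
```

With real data, the Pathogen coefficient from either fit can be compared against the plain glm() estimate to see how much the spatial term changes it. For coordinates spanning the whole globe, the spherical smooth is probably the safer of the two, since it respects great-circle distances.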