rcluster-analysisgeospatialspdepgwr

How to input dissimilarity matrix in spatial analysis in spdep R


Aim: I want to create a dissimilarity matrix between pairs of coordinates. I want to use this matrix as an input to calculate local spatial clusters using Moran's I (LISA) and latter in geographically weighted regression (GWR).

Problem: I know I can use dnearneigh{spdep} to calculate a distance matrix. However, I want to use the travel-time between polygons I already have estimated. In practice, I think this would be like inputting a dissimilarity matrix that tells the distance/difference between polygons based on a another characteristic. I've tried inputting my matrix to dnearneigh{spdep}, but I get the error Error: ncol(x) == 2 is not TRUE

dist_matrix <- dnearneigh(diss_matrix_invers, d1=0, d2=5, longlat = F, row.names=rn)

Any suggestions? There is a reproducible example below:

EDIT: Digging a bit further, I think I could use mat2listw{spdep} but I'm still not sure it keeps the correspondence between the matrix and the polygons. If I add row.names = T it returns an error row.names wrong length :(

listw_dissi <- mat2listw(diss_matrix_invers)
lmoran <- localmoran(oregon.tract@data$white, listw_dissi, 
                     zero.policy=T, alternative= "two.sided")

Reproducible example

library(UScensus2000tract)
library(spdep)
library(ggplot2)
library(dplyr)
library(reshape2)
library(magrittr)
library(data.table)
library(reshape)
library(rgeos)
library(geosphere)

# load data
  data("oregon.tract")

# get centroids as a data.frame
  centroids <- as.data.frame( gCentroid(oregon.tract, byid=TRUE) )

# Convert row names into first column
  setDT(centroids, keep.rownames = TRUE)[]

# create Origin-destination pairs
  od_pairs <- expand.grid.df(centroids, centroids) %>% setDT()
  colnames(od_pairs) <- c("origi_id", "long_orig", "lat_orig", "dest_id", "long_dest", "lat_dest")     

# calculate dissimilarity between each pair. 
# For the sake of this example, let's use ellipsoid distances. In my real case I have travel-time estimates
  od_pairs[ , dist := distGeo(matrix(c(long_orig, lat_orig), ncol = 2), 
                         matrix(c(long_dest, lat_dest), ncol = 2))]

# This is the format of how my travel-time estimates are organized, it has some missing values which include pairs of origin-destination that are too far (more than 2hours apart)
  od_pairs <- od_pairs[, .(origi_id, dest_id, dist)]
  od_pairs$dist[3] <- NA

  >      origi_id    dest_id         dist
  > 1:   oregon_0   oregon_0      0.00000
  > 2:   oregon_1   oregon_0           NA
  > 3:   oregon_2   oregon_0  39874.63673
  > 4:   oregon_3   oregon_0  31259.63100
  > 5:   oregon_4   oregon_0  33047.84249

# Convert to matrix
  diss_matrix <- acast(od_pairs, origi_id~dest_id, value.var="dist") %>% as.matrix()

# get an inverse matrix of distances, make sure diagonal=0
  diss_matrix_invers <- 1/diss_matrix
  diag(diss_matrix_invers) <- 0

Calculate simple distance matrix

  # get row names
    rn <- sapply(slot(oregon.tract, "polygons"), function(x) slot(x, "ID"))
  # get centroids coordinates
    coords <- coordinates(oregon.tract)
  # get distance matrix
    diss_matrix <- dnearneigh(diss_matrix_invers, d1=0, d2=5, longlat =T, row.names=rn)

class(diss_matrix)
> [1] "nb"

Now how to use my diss_matrix_invers here?


Solution

  • you are right about the use of matlistw{spdep}. By default the function preserves the names of rows to keep correspondence between the matrix. You can also specify the row.names like so:

    listw_dissi <- mat2listw(diss_matrix_invers, row.names = row.names(diss_matrix_invers))  
    

    The list that is created will contain the appropriate names for the neighbours along with their distance as weights. You can check this by looking at the neighbours.

    listw_dissi$neighbours[[1]][1:5]
    

    And you should be able to use this directly to calculate Moran's I.

    dnearneigh{sdep}
    There is no way you can use diss_matrix within dnearneigh{spdep}, as this function takes in a list of coordinates.

    however, if you need to define a set of neighbours given a distance threshold (d1,d2) using your own distance matrix (travel-time). I think this function can do the trick.

    dis.neigh<-function(x, d1 = 0, d2=50){
      #x must be a symmetrical distance matrix
      #create empty list
      style = "M" #for style unknown
      neighbours<-list()
      weights<-list()
      #set attributes of neighbours list
      attr(neighbours, "class")<-"nb"
      attr(neighbours, "distances")<-c(d1,d2)
      attr(neighbours, "region.id")<-colnames(x)
    
      #check each row for neighbors that satisfy distance threshold
      neighbour<-c()
      weight<-c()
      i<-1
      for(row in c(1:nrow(x))){
        j<-1
        for(col in c(1:ncol(x))){
          if(x[row,col]>d1 && x[row,col]<d2){
            neighbour[j]<-col
            weight[j]<-1/x[row,col] #inverse distance (dissimilarity)
            j<-1+j
          }
        }
        neighbours[i]<-list(neighbour)
        weights[i]<-list(weight)
        i<-1+i
      }
    
      #create neighbour and weight list
      res <- list(style = style, neighbours = neighbours, weights = weights)
      class(res) <- c("listw", "nb")
      attr(res, "region.id") <- attr(neighbours, "region.id")
      attr(res, "call") <- match.call()
    
      return(res)
    }
    

    And use it like so:

    nb_list<-dis.neigh(diss_matrix, d1=0, d2=10000)
    lmoran <- localmoran(oregon.tract@data$white, nb_lists, alternative= "two.sided")