Tags: r, purrr, r-sf

How to fill missing values based on neighboring polygons in R?


Consider the following dataset, to which I add some missing values (for illustration only):

library(dplyr)
library(sf)

demo(nc, ask = FALSE, verbose = FALSE)
nc$AREA[c(30, 45)] <- NA

I can get all the neighboring polygons for each county:

nc %>% mutate(
   INTERSECT = purrr::map(.x = geometry, .f = st_intersects, y = st_geometry(nc))
)

This gives me a list of the indices of neighboring counties for each row. Now I'd like to fill the missing area values with the mean area of neighboring polygons. How would I use these indices to take the mean over the corresponding rows?


Solution

  • index <- st_touches(nc, nc)
    
    # Replace each missing AREA with the mean AREA of its touching neighbors
    output <- nc %>%
      mutate(AREA = ifelse(is.na(AREA),
                           apply(index, 1, function(i) mean(.$AREA[i])),
                           AREA))
    
    output$AREA[c(30, 45)]
    
    [1] 0.1510 0.1335
    

    Checking the answers:

    Indices for the two polygons' neighbors.

    index[c(30, 45)]
    
    [[1]]
    [1] 13 14 29 37 48
    
    [[2]]
    [1] 44 87
    

    Compute the mean area of each polygon's neighbors manually.

    mean(output$AREA[index[[30]]])
    
    [1] 0.151
    
    mean(output$AREA[index[[45]]])
    
    [1] 0.1335
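
The same fill logic can be sketched in base R on toy data, which makes the indexing step easier to follow. The vector `area` and the list `neighbors` below are made-up values for illustration, not taken from `nc`; `neighbors` only mimics the shape of what `st_touches()` returns (one integer vector of row indices per row). This variant also passes `na.rm = TRUE`, an addition not in the answer above, so that a neighbor which is itself missing does not turn the mean into `NA`.

```r
# Toy data: five areas, two of them missing (hypothetical values).
area <- c(0.10, NA, 0.30, NA, 0.50)

# Hypothetical neighbor list: element i holds the row indices
# of the neighbors of row i.
neighbors <- list(c(2L, 3L), c(1L, 3L), c(1L, 2L, 4L), c(3L, 5L), 4L)

# Mean of each row's neighbors, ignoring neighbors that are themselves NA.
neighbor_mean <- vapply(
  neighbors,
  function(i) mean(area[i], na.rm = TRUE),
  numeric(1)
)

# Replace only the missing entries; observed values are kept as-is.
filled <- ifelse(is.na(area), neighbor_mean, area)
filled
# [1] 0.1 0.2 0.3 0.4 0.5
```

If every neighbor of a row is missing, `mean(..., na.rm = TRUE)` returns `NaN` (as for row 5 of `neighbor_mean` here), so a row like that would still need separate handling if its own value were missing too.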