I have spatial data and would like to create a new column in my dataframe that calculates the average distance to neighboring coordinates within a radius of my choice.
I found this code that gets me kind of close:
library(sf)
library(dplyr)
library(leaflet)
# three semi random cities in NC
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
x = c(-78.633333, -79.819444, -77.912222),
y = c(35.766667, 36.08, 34.223333)) %>%
sf::st_as_sf(coords = c("x", "y"), crs = 4326)
#Buffer of 2 arc degrees around Greensboro
buffer <- sf::st_buffer(cities[2, ], 200000)
# Draw Leaflet map
leaflet() %>%
addProviderTiles("OpenStreetMap") %>%
addPolygons(data = buffer) %>%
addCircleMarkers(data = cities,
color = "red",
fillOpacity = 2/3,
stroke = F)
# Count # of coordinates in radius
count <- sf::st_join(cities, buffer, left = F) %>%
nrow()
I want to be able to modify this code with the adjustable radius and say, when I used X as the radius, the average distance between Greensboro and whatever other points are captured in the circle is = 15.4 miles.
Can someone help point me in the right direction of how to solve this problem?
This can be most easily done using the sfdep package.
The idea is to create a neighbor matrix using a distance band. Then, with those neighbors, find the distances from the focal point. Lastly, iterate over the resultant list column to calculate the average.
library(sf)
library(sfdep)
library(dplyr)
library(leaflet)
# three semi random cities in NC
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
x = c(-78.633333, -79.819444, -77.912222),
y = c(35.766667, 36.08, 34.223333)) %>%
as_tibble() |>
sf::st_as_sf(coords = c("x", "y"), crs = 4326)
cities |>
mutate(
nb = st_dist_band(geometry),
dists = st_nb_dists(geometry, nb),
avg_dist = purrr::map_dbl(dists, mean)
)
#> Simple feature collection with 3 features and 4 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -79.81944 ymin: 34.22333 xmax: -77.91222 ymax: 36.08
#> Geodetic CRS: WGS 84
#> # A tibble: 3 × 5
#> name geometry nb dists avg_dist
#> * <chr> <POINT [°]> <nb> <list> <dbl>
#> 1 Raleigh (-78.63333 35.76667) <int [2]> <dbl [2]> 148.
#> 2 Greensboro (-79.81944 36.08) <int [1]> <dbl [1]> 112.
#> 3 Wilmington (-77.91222 34.22333) <int [1]> <dbl [1]> 184.