I am working on a problem that uses data that resembles the following snippet:
name = c('A', 'B', 'C', 'D', 'E','F','G','H','I','J')
lat = runif(10,min=37,max=39)
lon = runif(10,min=-91,max=-89)
metric = runif(10,min=0,250)
df = as.data.frame(cbind(name,lat,lon,metric))
I want to be able to choose a radius of X kilometers, let's say 15 for this example, and use that radius to first count the # of observations that fall within the range for each row of my data where each row gets a turn as the center of the radius. Then, I want to take an average of the metric for all the observations within the radius.
Ideally, the final dataset would look something like this:
name lat lon metric count avg_metric
A 37.1 -91.1 44 6 34.33
B 37.2 -90.2 24 3 23.35
C 37.3 -90.3 37 2 19.93
These numbers are made up, but something that looks like this is what I'm aiming for.
I found some code that kind of did what I was looking for below:
library(spatialrisk)
# Stores
stores <- data.frame(store_id = 1:3,
lat = c(40.7505, 40.7502, 40.6045),
long = c(-73.8456, -73.8453, -73.8012))
# My location
me <- data.frame(lat = 40.7504, long = -73.8456)
spatialrisk::points_in_circle(stores, me$long[1], me$lat[1], radius = 100, lon = long)
However, this isn't exactly what I'm looking for. What can I try next?
Here's an example using the sf
library. I converted to a coordinate system that understands metres and then used st_buffer
to make circles, st_within
to work out the overlaps and then sapply
to calculate the count and average metric.
library(sf)
library(dplyr)
name <- c('A', 'B', 'C', 'D', 'E','F','G','H','I','J')
lat <- runif(10,min=37,max=39)
lon <- runif(10,min=-91,max=-89)
metric <- runif(10,min=0,250)
df <- as.data.frame(cbind(name,lat,lon,metric))
# make a spatial data frame
df_sf <- st_as_sf(df, coords = c('lon', 'lat'), crs=4326)
# transform to a coordinate system which understands metres
# the original data is near St Louis, so UTM 15 is appropriate
df_sf_transformed <- st_transform(df_sf, crs = 32615)
# make circles of radius 15km around each point
df_sf_buffers <- st_buffer(df_sf_transformed, dist = 15000) %>% st_geometry()
# work out which point is within each circle
overlaps <- st_within(y = df_sf_buffers, x = df_sf_transformed)
# count the number of overlaps for each circle
# the sapply function is like a for loop that iterates over each
# row in the overlaps object and calls the length function to
# calculate the length of each row
counts <- data.frame(count = sapply(overlaps, length))
# a custom function to sum the metrics for selected points
find_avg_metric <- function(sf_df, indices) {
entries <- sf_df[indices, ]
result <- sum(as.numeric(entries$metric))
result
}
# iterate over the overlaps object and use the indices found to access
# the relevant rows of the original data frame to sum up the metric column
metrics <- data.frame(avg_metric = sapply(overlaps, function(index) find_avg_metric(df_sf_transformed, index)))
# The final answer
final_result <- df %>% bind_cols(counts) %>% bind_cols(metrics)
final_result
#name lat lon metric count avg_metric
#1 A 37.9383664987981 -90.567177023273 30.6012906949036 1 30.60129
#2 B 37.8796539758332 -89.6363721224479 139.638845692389 1 139.63885
#3 C 37.9519012323581 -90.8471902268939 106.271727592684 1 106.27173
#4 D 38.3774091359228 -89.1159357768483 186.965031607542 1 186.96503
#5 E 37.3993454128504 -89.7444549454376 29.5921949436888 2 194.88581
#6 F 38.6436389596201 -89.0847784169018 150.506012840196 1 150.50601
#7 G 38.4185414239764 -90.1974869910628 162.634110951331 1 162.63411
#8 H 38.2770276684314 -90.4734013564885 69.9945208616555 1 69.99452
#9 I 37.3666391288862 -89.6443270738237 165.293618629221 2 194.88581
#10 J 38.659956720192 -89.7836953452788 123.549888376147 1 123.54989