I have two shape files, with point coordinates.
library(sf)
# Shapefile 1:
id <- c(98,76,88)
lat <- c(28.54265,28.54474,28.54463)
long <- c(77.20034,77.19437,77.19354)
score1 <- c(3,2,0)
score2 <- c(2,1,2)
file1 <- data.frame(id,lat,long,score1,score2)
file1_sf <- st_as_sf(file1, coords = c("long", "lat"), crs = 4326L)
#Shapefile 2:
name <- c("A","B")
lat <- c(28.6705,28.6735)
long <- c(77.41588,77.28998)
feature <- c("red","yellow")
file2 <- data.frame(name,lat,long,feature)
file2_sf <- st_as_sf(file2, coords = c("long", "lat"), crs = 4326L)
Now I want to find, for each point in File 1, the closest point in File 2 and the distance between them, while retaining all the columns from both files.
I used st_distance() and then rowwise() to get the minimum distance. However, I am not able to retain all the columns.
Is there an elegant way of solving this problem? I have 40k locations in file 1 and 200 coordinates in file 2.
With about 40k rows, running Rfast::rowMins() twice comes at an acceptable cost.
x = sf::st_distance(file1_sf, file2_sf)
i = Rfast::rowMins(x)
d = Rfast::rowMins(x, value = TRUE)
cbind.data.frame(file1_sf, "NearestPointIn2" = file2_sf$name[i], "Distance" = d)
  id score1 score2                  geometry NearestPointIn2 Distance
1 98      3      2 POINT (77.20034 28.54265)               B 16978.60
2 76      2      1 POINT (77.19437 28.54474)               B 17090.98
3 88      0      2 POINT (77.19354 28.54463)               B 17145.58
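If Rfast is not available, the same row-wise minimum can be sketched in base R. This is only an assumption-laden variant of the code above, not a benchmarked replacement: it strips the units from the distance matrix with units::drop_units(), uses max.col() on the negated matrix to get the column index of each row minimum, and then picks the distances by matrix indexing. The toy data from the question is rebuilt so the block runs on its own.

```r
library(sf)

# Same toy data as in the question
file1 <- data.frame(
  id     = c(98, 76, 88),
  lat    = c(28.54265, 28.54474, 28.54463),
  long   = c(77.20034, 77.19437, 77.19354),
  score1 = c(3, 2, 0),
  score2 = c(2, 1, 2)
)
file2 <- data.frame(
  name    = c("A", "B"),
  lat     = c(28.6705, 28.6735),
  long    = c(77.41588, 77.28998),
  feature = c("red", "yellow")
)
file1_sf <- st_as_sf(file1, coords = c("long", "lat"), crs = 4326L)
file2_sf <- st_as_sf(file2, coords = c("long", "lat"), crs = 4326L)

# Full distance matrix (rows: file 1, columns: file 2), as plain numbers
x <- units::drop_units(st_distance(file1_sf, file2_sf))

# Column index of the smallest distance in each row
i <- max.col(-x, ties.method = "first")

# The smallest distance itself, picked by (row, column) pairs
d <- x[cbind(seq_len(nrow(x)), i)]

res <- cbind.data.frame(file1_sf,
                        "NearestPointIn2" = file2_sf$name[i],
                        "Distance" = d)
res
```

For a 40k x 200 matrix this should still be cheap, but I have not timed it against Rfast.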
A merged version:
x = sf::st_distance(file1_sf, file2_sf)
i = Rfast::rowMins(x)
d = Rfast::rowMins(x, value = TRUE)
merge(cbind.data.frame(file1_sf, "name" = file2_sf$name[i], "Distance" = d),
file2_sf, by = "name")
# rm(x, i, d)
  name id score1 score2                geometry.x Distance feature               geometry.y
1    B 98      3      2 POINT (77.20034 28.54265) 16978.60  yellow POINT (77.28998 28.6735)
2    B 76      2      1 POINT (77.19437 28.54474) 17090.98  yellow POINT (77.28998 28.6735)
3    B 88      0      2 POINT (77.19354 28.54463) 17145.58  yellow POINT (77.28998 28.6735)
I do not know how the naming should be done and cannot do any better than using "name". But this is just aesthetics and can be changed anytime.
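A different route, which is not the approach above, is to let sf find the nearest feature directly with st_nearest_feature() and then compute only the matching distances with st_distance(..., by_element = TRUE). That avoids building the full distance matrix. The sketch below rebuilds the toy data from the question so it runs on its own; the column name distance_m and the object names nearest and result are my own choices for illustration.

```r
library(sf)

# Same toy data as in the question
file1 <- data.frame(
  id     = c(98, 76, 88),
  lat    = c(28.54265, 28.54474, 28.54463),
  long   = c(77.20034, 77.19437, 77.19354),
  score1 = c(3, 2, 0),
  score2 = c(2, 1, 2)
)
file2 <- data.frame(
  name    = c("A", "B"),
  lat     = c(28.6705, 28.6735),
  long    = c(77.41588, 77.28998),
  feature = c("red", "yellow")
)
file1_sf <- st_as_sf(file1, coords = c("long", "lat"), crs = 4326L)
file2_sf <- st_as_sf(file2, coords = c("long", "lat"), crs = 4326L)

# For every point in file 1, the row index of the nearest point in file 2
i <- st_nearest_feature(file1_sf, file2_sf)

# Distance between each file 1 point and its matched file 2 point only
d <- st_distance(file1_sf, file2_sf[i, ], by_element = TRUE)

# Attribute columns of the matched file 2 rows (geometry dropped)
nearest <- st_drop_geometry(file2_sf)[i, , drop = FALSE]
rownames(nearest) <- NULL

# All columns of file 1, all attributes of the nearest file 2 point, and the distance
result <- cbind(file1_sf, nearest, distance_m = d)
result
```

This keeps the geometry of file 1 only. If the geometry of the matched file 2 point is needed as well, the merge() version above already provides it as geometry.y.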