r geometry distance shapefile

Distance between point coordinates from two different data frames in R


I have two shape files, with point coordinates.

library(sf)

# Shapefile 1:

id <- c(98,76,88)
lat <- c(28.54265,28.54474,28.54463)
long <- c(77.20034,77.19437,77.19354)
score1 <- c(3,2,0)
score2 <- c(2,1,2)
file1 <- data.frame(id,lat,long,score1,score2)

file1_sf <- st_as_sf(file1, coords = c("long", "lat"), crs = 4326L)

#Shapefile 2:

name <- c("A","B")
lat <- c(28.6705,28.6735)
long <- c(77.41588,77.28998)
feature <- c("red","yellow")

file2 <- data.frame(name,lat,long,feature)

file2_sf <- st_as_sf(file2, coords = c("long", "lat"), crs = 4326L)

Now I want to find, for each point in File 1, the closest point in File 2 and the distance between them. I also want to retain all the columns.

I used st_distance() and then rowwise() to get the minimum distance per row. However, I was not able to retain all the columns.

Is there an elegant way of solving this problem? I have 40k locations in file 1 and 200 coordinates in file 2.


Solution

  • With 40k rows, calling Rfast::rowMins() twice (once for the index of the minimum, once for its value) comes at an acceptable cost.

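    x = sf::st_distance(file1_sf, file2_sf)  # distance matrix: rows = file1, columns = file2
    i = Rfast::rowMins(x)                    # column index of the nearest file2 point per row
    d = Rfast::rowMins(x, value = TRUE)      # the minimum distance itself
    cbind.data.frame(file1_sf, "NearestPointIn2" = file2_sf$name[i], "Distance" = d)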
    
      id score1 score2                  geometry NearestPointIn2 Distance
    1 98      3      2 POINT (77.20034 28.54265)               B 16978.60
    2 76      2      1 POINT (77.19437 28.54474)               B 17090.98
    3 88      0      2 POINT (77.19354 28.54463)               B 17145.58
    

    A merged version that also brings in the columns of file2_sf:

    x = sf::st_distance(file1_sf, file2_sf)
    i = Rfast::rowMins(x)
    d =  Rfast::rowMins(x, value = TRUE)
    merge(cbind.data.frame(file1_sf, "name" = file2_sf$name[i], "Distance" = d), 
          file2_sf, by = "name")
    # rm(x, i, d)
    
      name id score1 score2                geometry.x Distance feature               geometry.y
    1    B 98      3      2 POINT (77.20034 28.54265) 16978.60  yellow POINT (77.28998 28.6735)
    2    B 76      2      1 POINT (77.19437 28.54474) 17090.98  yellow POINT (77.28998 28.6735)
    3    B 88      0      2 POINT (77.19354 28.54463) 17145.58  yellow POINT (77.28998 28.6735)
    

    I do not know how the columns should be named and cannot do any better than using "name". But this is just aesthetics and can be changed at any time.
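As a possible alternative to the distance-matrix approach above, here is a minimal sketch that stays within sf. It assumes `sf::st_nearest_feature()` to get the index of the nearest File 2 point, and `sf::st_distance(..., by_element = TRUE)` to compute only the matched distances, so the full 40k x 200 matrix is never built. The column names `nearest_name`, `nearest_feature` and `distance_m` are illustrative and not from the original post.

```r
library(sf)

# Data copied from the question.
file1 <- data.frame(
  id     = c(98, 76, 88),
  lat    = c(28.54265, 28.54474, 28.54463),
  long   = c(77.20034, 77.19437, 77.19354),
  score1 = c(3, 2, 0),
  score2 = c(2, 1, 2)
)
file2 <- data.frame(
  name    = c("A", "B"),
  lat     = c(28.6705, 28.6735),
  long    = c(77.41588, 77.28998),
  feature = c("red", "yellow")
)
file1_sf <- st_as_sf(file1, coords = c("long", "lat"), crs = 4326L)
file2_sf <- st_as_sf(file2, coords = c("long", "lat"), crs = 4326L)

# For each point in file1_sf, the row index of the nearest point in file2_sf.
i <- st_nearest_feature(file1_sf, file2_sf)

# Distance from each file1 point to its matched file2 point only
# (element-wise, so no full distance matrix is built).
d <- st_distance(file1_sf, file2_sf[i, ], by_element = TRUE)

# Attach the matched attributes; the new column names are illustrative.
res <- file1_sf
res$nearest_name    <- file2_sf$name[i]
res$nearest_feature <- file2_sf$feature[i]
res$distance_m      <- as.numeric(d)  # drop the units class; metres for EPSG:4326

res
```

The result is still an sf object carrying the File 1 geometry. If the File 2 geometry is needed as well, it could be added as an extra column in the same way, although an sf object has only one active geometry column at a time.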