I have this dataset: https://public.opendatasoft.com/explore/dataset/georef-netherlands-postcode-pc4/export/?location=6,52.1564,5.29337&basemap=jawg.light
I need to upload it to R and extract the centroid of the polygons in terms of longitude and latitude, so that I can use it to calculate distances between two centroids.
So far I've done this:
PC4shape <- read_sf(dsn = "georef-netherlands-postcode-pc4", layer = "georef-netherlands-postcode-pc4")
PC4shape$centroids <- st_transform(PC4shape, 28992) %>%
st_centroid() %>%
st_transform(., '+proj=longlat +ellps=GRS80 +no_defs') %>%
st_geometry()
Following Function to calculate geospatial distance between two points (lat,long) using R
However, I am sure I have the projection and the EPSG wrong. I just don't know how to find the right one. Any help is greatly appreciated.
It isn't clear between which centroids you want to measure, so there's a sample below of couple of different ways to find some or all distances.
The code below finds the centroid of each row, shows the distance between centroids of the first two rows, and then a small distance matrix of the first 10 rows.
The crs is the original wgs 84, distances are in meters. I didn't see a need to change the crs.
library(tidyverse)
library(sf)
x <- read_sf('shapefile_dir/') # Replace with local directory name
x <- x %>% mutate(centroids = st_centroid(st_geometry(.)))
#Distance between Goeree-Overflakkee (row 1) and Dordrecht (row 2):
st_distance(x$centroids[1], x$centroids[2])
#> Units: [m]
#> [,1]
#> [1,] 53317.83
#distance matrix of first 10 rows:
x[1:10,] %>%
st_set_geometry('centroids') %>% # Since there are 2 geometry cols, this sets the active one to 'centroids'
st_distance()
#> Units: [m]
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 0.00 53317.826 49330.003 85844.745 76152.903 69198.288 89437.405
#> [2,] 53317.83 0.000 7355.978 38618.416 37934.810 31387.648 45356.081
#> [3,] 49330.00 7355.978 0.000 38510.970 34633.274 27580.688 44165.444
#> [4,] 85844.75 38618.416 38510.970 0.000 16873.684 19856.326 8741.436
#> [5,] 76152.90 37934.810 34633.274 16873.684 0.000 7298.181 14836.265
#> [6,] 69198.29 31387.648 27580.688 19856.326 7298.181 0.000 20611.458
#> [7,] 89437.40 45356.081 44165.444 8741.436 14836.265 20611.458 0.000
#> [8,] 84873.91 42274.715 40485.556 9634.087 9915.349 15823.963 4928.774
#> [9,] 88788.51 45795.090 44243.643 10476.055 13443.233 19690.611 2255.828
#> [10,] 94857.25 50872.407 49779.022 13205.312 19479.782 25802.514 5617.354
#> [,8] [,9] [,10]
#> [1,] 84873.911 88788.511 94857.246
#> [2,] 42274.715 45795.090 50872.407
#> [3,] 40485.556 44243.643 49779.022
#> [4,] 9634.087 10476.055 13205.312
#> [5,] 9915.349 13443.233 19479.782
#> [6,] 15823.963 19690.611 25802.514
#> [7,] 4928.774 2255.828 5617.354
#> [8,] 0.000 3922.534 9994.837
#> [9,] 3922.534 0.000 6115.383
#> [10,] 9994.837 6115.383 0.000
# A distance matrix of all pairs of points can be found by not
# subsetting the data:
# x %>% st_set_geometry('centroids') %>% st_distance()
Created on 2022-03-10 by the reprex package (v0.3.0)