I have a kml file that describes four polygons(the kml file is called "core.kml", and I have put a copy at this link). I would like to choose random points within the area defined by the polygons. Ideally I would like to choose points evenly over the whole area, so choose more points from the larger polygons than the small ones, in proportion to their area. I have borrowed the code from answer 1 here, which works beautifully with the North Carolina polygon (at least when I update it by inserting nc <- st_geometry(nc)
). But with my polygons, the st_sample
step always gives this error:
st_as_s2(): dropping Z and/or M coordinate Error in wk_handle.wk_wkb(wkb, s2_geography_writer(oriented = oriented, :
Loop 0 is not valid: Edge 15 crosses edge 19
I have tried inserting sf_use_s2(FALSE)
, but I still get the same error. The plot
command does work, and the polygons look as expected. My code:
core<-st_read("core.kml")
core<-st_geometry(core)
sf_use_s2(FALSE)
core.points<-st_sample(core,5)
plot(core)
If it's about to ensure that there are points on each polygon, you might like to try something along the lines
library(sf)
sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
core = st_read("/Users/cara/Desktop/core.kml") |> st_geometry() |> st_make_valid()
core
#> Geometry set for 4 features
#> Geometry type: GEOMETRY
#> Dimension: XY
#> Bounding box: xmin: -79.06133 ymin: 33.69862 xmax: -78.90788 ymax: 33.8351
#> Geodetic CRS: WGS 84
#> MULTIPOLYGON (((-79.014 33.80998, -79.00692 33....
#> POLYGON ((-79.06133 33.82931, -79.05514 33.8306...
#> POLYGON ((-78.99585 33.78962, -78.99496 33.7901...
#> POLYGON ((-78.91353 33.70372, -78.91197 33.7052...
ppwa = \(sfc, pts) {
# points per weighted area
a = st_area(sfc)
p = proportions(a)
as.numeric(p * pts)
}
# ensure sum(n) == pts, see below
floor_preserve = \(x) {
fx = floor(x)
i = tail(order(x - fx), round(sum(x)) - sum(fx))
fx[i] = fx[i] + 1
fx
}
n = ppwa(core, 25) |> floor_preserve()
n
#> [1] 14 2 6 3
sum(n)
#> [1] 25
set.seed(1119)
p = st_sample(core, n)
#> although coordinates are longitude/latitude, st_intersects assumes that they
#> are planar
#> ... gets repeated ...
plot(core)
points(p, pch = 4, col = palette.colors()[7])
#> Warning in st_centroid.sfc(x, of_largest_polygon = of_largest_polygon):
#> st_centroid does not give correct centroids for longitude/latitude data
Notice that (1) we haven't corrected issues with core
. Please have a look on the warnings. This might be a good discussion to start your research; islands are difficult; (2) the total number of points is not random, as we set it to 25
. The number of points per polygon is therefore not random randon: it's 25
multiplied by the relative area of each polygon, i.e. weighted.