st_intersection is very slow compared to st_intersects. So why not use the latter instead of the former? Here's an example with a small toy dataset, but the difference in execution time is huge for my actual set of just 62,020 points intersected with an actual geographic region boundary. I have 24 GB of RAM and the st_intersects code takes a few seconds whereas the st_intersection code takes more than 15 minutes (possibly much more, I haven't had the patience to wait...). Does st_intersection do anything that I am not getting with st_intersects?
The code below handles sfc objects, but I believe it would work equally for sf objects.
library(sf)
library(dplyr)
# create square
s <- rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)) %>% list %>% st_polygon %>% st_sfc
# create random points
p <- runif(50, 0, 11) %>% cbind(runif(50, 0, 11)) %>% st_multipoint %>% st_sfc %>% st_cast("POINT")
# intersect points and square with st_intersection
st_intersection(p, s)
# intersect points and square with st_intersects (courtesy of https://stackoverflow.com/a/49304723/7114709)
p[st_intersects(p, s) %>% lengths > 0]
The answer is that, in general, the two methods do different things, though in your particular case (finding the intersection of a collection of points and a polygon), st_intersects can be used to efficiently do the same job.
We can show the difference with a simple example modified from your own. We start with a square:
library(sf)
library(dplyr)
# create square
s <- rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)) %>%
list %>%
st_polygon %>%
st_sfc
plot(s)
Now we will create a rectangle and draw it on the same plot with a dotted outline:
# create rectangle
r <- rbind(c(-1, 2), c(11, 2), c(11, 4), c(-1, 4), c(-1, 2)) %>%
list %>%
st_polygon %>%
st_sfc
plot(r, add = TRUE, lty = 2)
Now we find the intersection of the two polygons and plot it in red:
# intersect square and rectangle with st_intersection
i <- st_intersection(s, r)
plot(i, add = TRUE, lty = 2, col = "red")
When we examine the object i, we see that it is a new polygon:
i
#> Geometry set for 1 feature
#> geometry type: POLYGON
#> dimension: XY
#> bbox: xmin: 1 ymin: 2 xmax: 10 ymax: 4
#> epsg (SRID): NA
#> proj4string: NA
#> POLYGON ((10 4, 10 2, 1 2, 1 4, 10 4))
Whereas, if we use st_intersects, we only get an index of which features intersect (a sparse list by default, or a logical matrix with sparse = FALSE), telling us whether there is indeed an intersection between r and s. If we try to use this to subset r to find the intersection, we don't get the intersected shape; we just get our original rectangle back:
r[which(unlist(st_intersects(s, r)) == 1)]
#> Geometry set for 1 feature
#> geometry type: POLYGON
#> dimension: XY
#> bbox: xmin: -1 ymin: 2 xmax: 11 ymax: 4
#> epsg (SRID): NA
#> proj4string: NA
#> POLYGON ((-1 2, 11 2, 11 4, -1 4, -1 2))
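To make the return value of st_intersects concrete, here is a minimal sketch that rebuilds the same s and r so it runs on its own. It shows the default sparse index list and the dense logical matrix you get with sparse = FALSE; neither contains any new geometry. The names hits and hits_dense are just illustrative.

```r
library(sf)

# Rebuild the square and rectangle from above so this runs on its own
s <- st_sfc(st_polygon(list(rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)))))
r <- st_sfc(st_polygon(list(rbind(c(-1, 2), c(11, 2), c(11, 4), c(-1, 4), c(-1, 2)))))

# Default: a sparse list with one element per feature of s,
# holding the indices of the features of r that it intersects
hits <- st_intersects(s, r)
lengths(hits) > 0

# Dense alternative: a logical matrix
# (rows = features of s, columns = features of r)
hits_dense <- st_intersects(s, r, sparse = FALSE)
hits_dense
```

Either form can be used to build a subsetting index, but only st_intersection computes the clipped shape itself.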
The situation that you have is different, because you are trying to find the subset of points that intersect a polygon. In this case, the intersection of a group of points with a polygon is the same as the subset of points that meet the criterion st_intersects.
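As a rough check of that claim on the toy data, the sketch below runs both approaches and compares the resulting coordinates. The seed is arbitrary, and sorted_coords is a hypothetical helper written only for this comparison; the coordinates are sorted first so the check does not depend on the order in which either function returns its points.

```r
library(sf)

set.seed(42)  # arbitrary seed, only so the example is reproducible

# Same square and random points as in the question
s <- st_sfc(st_polygon(list(rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)))))
p <- st_cast(st_sfc(st_multipoint(cbind(runif(50, 0, 11), runif(50, 0, 11)))), "POINT")

# Approach 1: compute the intersection geometry
by_intersection <- st_intersection(p, s)

# Approach 2: keep the points that intersect the square
by_subset <- p[lengths(st_intersects(p, s)) > 0]

# Hypothetical helper: coordinates as a plain matrix, sorted by X then Y
sorted_coords <- function(g) {
  xy <- st_coordinates(g)[, c("X", "Y"), drop = FALSE]
  xy <- xy[order(xy[, "X"], xy[, "Y"]), , drop = FALSE]
  unname(xy)
}

# TRUE if both approaches return the same set of points
same <- isTRUE(all.equal(sorted_coords(by_intersection), sorted_coords(by_subset)))
same
```

Wrapping each approach in system.time() on your real data should show the speed difference you describe, though the exact timings will depend on your sf and GEOS versions.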
So it is great that you have found a valid way of getting a quicker intersection. Just be aware this will only work with collections of points intersecting a polygon.
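Since the question also asks about sf objects, here is a hedged sketch of the same subsetting idea with an sf data frame. The id column is a made-up attribute added only for illustration. For sf objects the index goes in the row position, and sf also lets you subset directly by a geometry, which uses st_intersects as its default predicate.

```r
library(sf)

set.seed(1)  # arbitrary seed, only so the example is reproducible

s <- st_sfc(st_polygon(list(rbind(c(1, 1), c(10, 1), c(10, 10), c(1, 10), c(1, 1)))))
p <- st_cast(st_sfc(st_multipoint(cbind(runif(50, 0, 11), runif(50, 0, 11)))), "POINT")

# Hypothetical attribute column `id`, added only for illustration
pts <- st_sf(id = seq_along(p), geometry = p)

# Explicit predicate: note the trailing comma, because sf objects are data frames
inside_a <- pts[lengths(st_intersects(pts, s)) > 0, ]

# Spatial subsetting shorthand: rows of pts that intersect s
inside_b <- pts[s, ]

# Both should select the same rows
identical(inside_a$id, inside_b$id)
```

One difference to keep in mind: if the polygon layer were itself an sf object with attributes, st_intersection would also carry those attributes over to the result, which a plain subset does not do.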