I have two spatial data sets -- one with a polygon of California state waters (download here), and one with polygons of National Marine Sanctuaries (download Greater Farallones NMS for example below).
I want to cover the California state waters polygon with national marine sanctuaries polygons, so that when I query a SpatVector with points, the points are assigned to either state waters or a specific national marine sanctuary. But when I cover state waters with marine sanctuaries polygons, the coastlines of the two data sets don't line up, and I end up with a thin stretch of state waters between the marine sanctuaries and the shoreline. As a result, points that should be classified as in marine sanctuaries are classified as in state waters.
This is a (hopefully) reproducible example:
library(terra)
## data: state waters
ca_waters <- terra::vect('sg211gq3741.shp')
plot(ca_waters)
ca_waters
## data: one of the NMS polygons (Greater Farallones)
gf <- vect('gfnms_py.shp')
plot(gf)
gf
## match CRS
crs(gf) == crs(ca_waters)
gf <- project(gf, ca_waters)
crs(gf) == crs(ca_waters)
## crop ca waters to a smaller area
cw2 <- crop(ca_waters, extend(ext(gf), 1))
## plot gfnms over ca waters -- can see some of
plot(cw2, border='red')
polys(gf, 'gray',border='gray')
## get a better look at the difference in polygon extent near the coastline
e <- ext(-122.9151,-122.566,37.8552,38.0326)
plot(crop(cw2,e), border='red')
polys(crop(gf,e), 'gray',border='gray')
## cover state waters with NMS
cw2 <- terra::cover(cw2, gf)
cw2$sourcethm[2] <- 'GFNMS_py.shp'
plot(cw2, 'sourcethm')
## repeat the plot above now that both polygons are in the same spatvector...
e <- ext(-122.9151,-122.566,37.8552,38.0326)
plot(crop(cw2,e), 'sourcethm')
Is this a projection issue? If not, how can I cover the thin stretch of state waters polygon between the marine sanctuary and the shoreline? I've used terra::buffer to expand the marine sanctuary polygons and cover the remaining state waters coastline, but this also covers part of the state waters polygon that I do want (at the true offshore boundary above and below the marine sanctuary polygon)...
The dataset developers took different decisions about what are coastal waters for estuaries and river mouths. The marine reserve data has less detail. For example, it does not include Drake's Estuary. So it is not so much a projection issue, but simply data that does not perfectly align.
Read the data directly:
library(terra)
url <- "https://geowebservices.stanford.edu/geoserver/wfs?request=GetFeature&service=wfs&srsName=EPSG%3A4326&typeName=druid%3Asg211gq3741&version=2.0.0"
ca_waters <- sf::st_read(url) |> sf::st_cast("GEOMETRYCOLLECTION") |>
sf::st_collection_extract("POLYGON") |> terra::vect()
gf <- vect("/vsizip/vsicurl/https://sanctuaries.noaa.gov/library/imast/gfnms_py2.zip/GFNMS_py.shp")
# project and subset
gf <- project(gf, ca_waters)
cw2 <- crop(ca_waters, ext(gf)+1)
I erase from the coastal waters the areas that are covered by the reserves. I find small remaining coastal water areas that are not included in the marine reserve data, and I combine these with the marine reserve data.
cw <- erase(cw2, gf) |> disagg()
cw$area <- expanse(cw)
cw_small <- cw[cw$area < 50000000, ]
cw_new <- cw[cw$area > 50000000, ]
gf_new <- aggregate(rbind(gf, cw_small))
plot(cw_new, border="red")
lines(gf_new, col="blue")
See terra::combineGeoms
for a more generic approach to combine small polygon pieces with larger neighboring polygons.