rggplot2gisr-sf

Merging polygons across the anti-meridian/date line (180/-180 degrees) in sf


I'm trying to use sf to plot a shapefile in a Pacific-centered map but since the shapefile was designed for a more "typical" Atlantic-centered projection, I'm getting lines through all the polygons at the anti-meridian (180º longitude). I've had this problem before, but was able to solve it then by finding a shapefile that was designed for the Pacific. In this case, that's not an option.

Is there some way to merge these polygons so that the lines disappear? I've tried st_union in various ways but the problem is these shapes are actually closed at that point and even when the thing is merged into a single polygon, the line persists. I feel like this has to be a problem other people have encountered.

How can I merge these things into a single polygon without the vertical line?

Here's some sample code using a subset of the data I'm working with:

library(tidyverse)
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE

regions <- st_read("https://gist.github.com/mhoban/ebdee566c4d1963e9fcd88fae818fc83#file-hawaii-geojson")
#> Reading layer `hawaii' from data source 
#>   `https://gist.github.com/mhoban/ebdee566c4d1963e9fcd88fae818fc83#file-hawaii-geojson' 
#>   using driver `GeoJSON'
#> Simple feature collection with 1 feature and 9 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -180 ymin: 13.36447 xmax: 180 ymax: 31.51548
#> Geodetic CRS:  WGS 84

# out of the box, polygons are split to either side of the map
ggplot(regions) + 
  geom_sf(fill = "lightgrey",linewidth=1)

Here you can see that plotting the shapefile directly shows it split on either side of the map:


# using a pacific-centered projection, the polygon is complete, 
# but there's a break at the anti-meridian
ggplot(regions %>% st_transform(3832)) + 
  geom_sf(fill = "lightgrey",linewidth=1)

I can use a Pacific-centered projection to get the complete shape, but there's still a vertical line at 180º:


# similarly, we can shift the longitudes
# but the problem persists
ggplot(regions %>% st_shift_longitude()) + 
  geom_sf(fill="lightgrey",linewidth=1)

I can also use st_shift_longitude, but I end up with the same thing:

Is there some way to dissolve these separate polygons into one? The problem (I think) is that they seem to be adjacent and not actually overlapping, so st_union keeps the boundary.

I see how it may get tricky because this is a MULTIPOLYGON with three pieces (the circle on the bottom plus the two pieces on top) rather than just two, but even if I have only the top two I can't figure out how to do it.


Solution

  • The issue is not with the anti-meridian per se. Your issue is because errors have been introduced somewhere during your workflow. I can only speculate where that occurred, but here is a fix.

    First, to illustrate the issue using your full dataset:

    library(sf)
    library(dplyr)
    library(ggplot2)
    
    # Load sf
    regions <- st_read("https://gist.github.com/mhoban/b85bce76410cd500c3828f5c9f8b81e1/raw/b19f104739dd20145cc3205cf2446e0ca6b0f6dc/marine_ecoregions.geojson")
    
    # Demonstrate error
    sf_error <- regions %>%
      st_make_valid() %>%
      st_shift_longitude()
    
    ggplot() +
      geom_sf(data = sf_error, aes(fill = REALM))
    

    1

    If you zoom in you will see that there are lines in the Eastern Indo-Pacific polygons at 180°, but not in the Central Indo-Pacific polygon. Also note that the line for the most southern of the two Eastern Indo-Pacific polygons (that crosses 180°) terminates within the polygon. This is a clear flag that something is amiss.

    The most robust approach would be to recreate your data from scratch, determine where the error occurred, and avoid it. However, if this is not possible, you can use something like sf::st_snap(). The snap tolerance in this solution, 165m, was found by trial and error.

    The downside is that any vertices within the snap tolerance can be snapped together, even if they are not problematic. But given the scale of your data, any effects will likely be minimal. If maximum accuracy is important, you could refine the following solution further by isolating the erroneous polygons and snapping them separately. This code will return a warning message which can be safely ignored in this use case.

    # Create Mollweide PROJ4 string with centred anti-meridian (for snapping)
    moll_prj <- "+proj=moll +lon_0=180 +datum=WGS84 +units=m +no_defs"
    
    # Convert to polygon geometries, correct geometry errors, project to PCS,
    # snap vertices, merge polygons by REALM, project back to original CRS
    sf_poly <- regions %>%
      st_cast("POLYGON") %>%
      st_make_valid() %>%
      st_transform(moll_prj) %>%
      st_snap(x = ., y = ., tolerance = 165) %>%
      summarise(geometry = st_union(geometry), .by = REALM) %>%
      st_transform(4326) %>%
      st_shift_longitude()
    # Warning message:
    # In st_cast.sf(., "POLYGON") :
    #   repeating attributes for all sub-geometries for which they may not be constant
    
    ggplot() +
      geom_sf(data = sf_poly, aes(fill = REALM))
    

    2