
Dissolving polygon features with the sf package


Dissolve is a common geoprocessing technique; an sf approach to it is discussed here.

I'm trying to replicate dissolve as it functions in ArcGIS. Consider counties by two groups in ArcGIS.

The ArcGIS dissolve command yields two polygons, regardless of the fact that the eastern peninsula consists of additional separate polygons. Like so:

This is the behavior I'd like to replicate in sf, but I have not managed to, as demonstrated below.

library(sf)
library(ggplot2)

nc <- st_read(system.file("shape/nc.shp", package="sf"))

#create two homogeneous spatial groups
nc$group <- ifelse(nc$CNTY_ <= 1980,1,2)

#plot
ggplot() + geom_sf(data=nc, aes(fill = factor(group)))  

#dissolve
library(dplyr) #summarize() here is dplyr's; a summarize() from another loaded package may mask it
nc_dissolve <- nc %>% group_by(group) %>% summarize() 

#plot dissolved
ggplot() + geom_sf(data=nc_dissolve, aes(fill = factor(group)))

#Cartographically, it looks like we have two polygons, but there are 
#actually several more wrapped up as MULTIPOLYGONS. We can plot these.
#(named nc_parts rather than t, to avoid masking base::t)
nc_parts <- nc_dissolve %>% st_cast() %>% st_cast("POLYGON")
ggplot() + geom_sf(data=nc_parts, aes(fill=factor(row.names(nc_parts))))

Notice the peninsula has multiple extraneous polygons.

How do I wind up with just two as in the ArcGIS case? Many thanks.


Solution

  • I am not too familiar with how ArcGIS defines a polygon, but the Simple Feature Access specification (an ISO standard) defines a polygon as a single exterior ring with zero or more interior rings denoting holes. Under that specification, a mainland plus a couple of islands is not a single polygon. To represent them as a single feature, the corresponding geometry type is MULTIPOLYGON. So your answer is already in nc_dissolve: it has two features.
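
One way to check this is to count features and polygon parts separately. The following is a minimal sketch that rebuilds nc_dissolve exactly as in the question; the name parts is introduced here purely for illustration, and the explicit cast to MULTIPOLYGON is a precaution in case the dissolved geometries come back with mixed types.

```r
library(sf)
library(dplyr)

# Rebuild the dissolved layer from the question
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc$group <- ifelse(nc$CNTY_ <= 1980, 1, 2)
nc_dissolve <- nc %>% group_by(group) %>% summarize()

# Number of features (rows): one per group
nrow(nc_dissolve)

# Geometry type of each feature after casting to a common type
nc_multi <- st_cast(nc_dissolve, "MULTIPOLYGON")
st_geometry_type(nc_multi)

# Split each feature into its individual polygon parts
# (st_cast warns that attributes are repeated for every part)
parts <- st_cast(nc_multi, "POLYGON")

# Number of polygon parts per group (mainland plus islands)
table(parts$group)
```

The row count of nc_dissolve is what corresponds to the result of a dissolve; the parts table only shows how many separate rings of land each of those features contains.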