I have a polygon shapefile which was vectorized from a raster layer (second image). This shapefile contains thousands of features with just one column which represent polygon level (ranging 1-5). The file size is massive and so I was trying to use Shapely's simplify tool to reduce the file size. The aim is to get to a similar result to ArcGIS pro's simplify polygon tool, which looks great and reduced the file size by 80% (third image).
So far the Shapely tool had produced a pretty similar result but it's still not perfect because bordering polygons do not align great smoothly (see first image).
I assume I need to use some sort of smoothing or interpolation function to sort it, but I can't seem to find one that works.
the code I have so far:
from shapely.geometry import shape, mapping
from shapely.ops import transform
from shapely.validation import make_valid
import fiona
import pyproj
# Define the projections
project_to_utm = pyproj.Transformer.from_crs("EPSG:4326", "EPSG:32633", always_xy=True).transform
project_to_wgs84 = pyproj.Transformer.from_crs("EPSG:32633", "EPSG:4326", always_xy=True).transform
# Open your shapefile
with fiona.open(r"shapefile.shp", 'r') as source:
schema = source.schema
crs = source.crs
features = list(source)
# Reproject to UTM, simplify, then reproject back to WGS84
simplified_features = []
for feature in features:
# Reproject to UTM
geom = shape(feature['geometry'])
geom_utm = transform(project_to_utm, geom)
# Simplify in UTM
simplified_geom_utm = geom_utm.simplify(tolerance=0.5, preserve_topology=True)
# Fix any invalid geometries (self-intersections)
if not simplified_geom_utm.is_valid:
simplified_geom_utm = make_valid(simplified_geom_utm)
# Reproject back to WGS84
simplified_geom_wgs84 = transform(project_to_wgs84, simplified_geom_utm)
# Append the simplified and validated geometry
simplified_features.append({
'geometry': mapping(simplified_geom_wgs84),
'properties': feature['properties']
})
# Save the simplified polygons to a new shapefile
with fiona.open(r"shapfile_simplified.shp", 'w', driver='ESRI Shapefile', schema=schema, crs=crs) as output:
for feature in simplified_features:
output.write(feature)
The problem you are facing is that shapely at the moment only can simplify geometries one by one. Because of this, gaps and slivers can/will appear between adjacent polygons because different points might be removed on the adjacent borders of the polygons.
To avoid this, you need "topology-aware" simplification. This typically means that first the common boundaries are determined. Then these common boundary lines are simplified. Finally the polygons are reconstructed. This way the common boundaries will stay common without gaps and slivers.
A library that supports this way of simplifying is topojson, via the toposimplify function.
Because toposimplify requires all geometries to be in memory anyway, I used geopandas to load and save the geometries in the sample script as this will be easier and more efficient.
Sample code:
import geopandas as gpd
import topojson
input_path = r"shapefile.shp"
output_path = r"shapefile_simplified.shp"
# Read the shapefile
input_gdf = gpd.read_file(input_path)
# Reproject to UTM
input_gdf = input_gdf.to_crs(32633)
# Convert to topology, simplify and convert back to GeoDataFrame
topo = topojson.Topology(input_gdf, prequantize=False)
topo_simpl = topo.toposimplify(0.5)
simpl_gdf = topo_simpl.to_gdf()
# Fix any invalid geometries (self-intersections)
simpl_gdf.geometry = simpl_gdf.geometry.make_valid()
# Reproject back to WGS84
simpl_gdf = simpl_gdf.to_crs(4326)
# Write to output file
simpl_gdf.to_file(output_path)