pythonosmnx

How to remove the ocean with OpenStreetMap?


I'm working with the OpenStreetMap library (osmnx, in python) data to extract the boundary of Salvador, Brazil. The code successfully removes water bodies and gets the mainland and the islands (i'm leaving them out for simplicity), but there's an issue with an artificial triangular boundary that extends into the ocean.

This triangle is generated by part of the administrative boundary that for some reason merges with the natural coastline and isn't subtracted by the ocean. How would you try to remove this artificial triangular boundary in OpenStreetMap? Is there some other filter I should be trying? I want to find a more general solution than doing "debuff, buff" on this manually. I tried using the coastline, but the results weren't great.

import osmnx as ox
import shapely

def get_city_boundary(place_name):
    city_gdf = ox.geocode_to_gdf(place_name)
    city_polygon = city_gdf.loc[0, "geometry"]

    expand = 0
    minx, miny, maxx, maxy = city_polygon.bounds
    bounding_box = shapely.geometry.box(minx - expand, miny - expand, maxx + expand, maxy + expand)

    try:
        water_tags = {"natural": ["bay"], "maritime": "yes"}
        water_gdf = ox.features_from_polygon(bounding_box, tags=water_tags)
        if not water_gdf.empty:
            water_union = water_gdf.geometry.union_all()
            city_polygon = city_polygon.difference(water_union)
    except Exception as e:
        print(f"Error processing water areas: {e}")

    city_polygon = city_polygon.buffer(0)
### change around here a bit if you want the islands of the city to appear
    if isinstance(city_polygon, shapely.geometry.MultiPolygon):
        polygons = list(city_polygon.geoms)
        city_polygon = max(polygons, key=lambda p: p.area)

    return city_polygon

place_name = "Salvador, Bahia, Brazil"
processed_polygon = get_city_boundary(place_name)

processed_gdf = ox.geocode_to_gdf(place_name).copy()
processed_gdf.loc[0, "geometry"] = processed_polygon

if processed_gdf.crs is None:
    processed_gdf.set_crs(epsg=4326, inplace=True)

output_file = "salvador_processed_boundary.shp"
processed_gdf.to_file(output_file) 

Shapefile of Salvador, Bahia, Brazil removing the ocean


Solution

  • OpenStreetMap maps coastlines rather than water, so the task isn't as straightforward as "download the political boundary, then subtract the water from it."

    Instead, the workflow is to get the political boundary and coastlines as linestrings, polygonize them, then drop the water polygon. This should give you a much simpler and more generalizable solution.

    import osmnx as ox
    import geopandas as gpd
    
    # get all municipalities and coastlines within Salvador
    place = "Salvador, Bahía, Brasil"
    tags = {"natural": "coastline", "place": "municipality"}
    gdf = ox.features.features_from_place(place, tags)
    
    # project and filter results to keep only Salvador municipality and coastlines
    mask1 = gdf["name"] == "Salvador"
    mask2 = gdf["natural"] == "coastline"
    gdf = ox.projection.project_gdf(gdf.loc[mask1 | mask2])
    
    # replace Salvador municipal boundary polygon with boundary linestring
    gdf.loc[mask1, "geometry"] = gdf.loc[mask1, "geometry"].boundary
    
    # polygonize the linestrings, drop the largest polygon (open water), then plot
    gdf_land = gpd.GeoDataFrame(geometry=gdf.polygonize(), crs=gdf.crs)
    ax = gdf_land.drop(gdf_land.area.idxmax()).plot()
    

    Salvador boundaries via OSMnx