pythonopenstreetmappyosmium

pyosmium - Build a Geojson linestring based on OSM Relation


I have a python script to analyse OSM data, and the objective is to build a GeoJson with specific data issued from OSM relation.

I'm currently focusing on OSM relation that represents 'hiking' trail like this one.

According to the document:

members
(read-only) Ordered list of relation members. See osmium.osm.RelationMemberList.

the relation object has an attribute members which collects all members of the relation.

Hence The first part of the script manages to extract all relation that have a tag sac_scale=hiking and collects all its ways.

The following script is on purpose focusing only on 1 specific relation : r104369

class HikingWaysfromRelations(osmium.SimpleHandler):
    def __init__(self):
        super(HikingWaysfromRelations, self).__init__()
        self.dict = {}

    def _getWays(self, elem, elem_type):
        # tag
        if 'sac_scale' in elem.tags and elem.tags['sac_scale']=='hiking' and elem.id==104369:

            list=[]
            for mem in elem.members:
                if mem.type=="w":
                    list.append(str(mem.ref))
            self.dict["r"+str(elem.id)]=list
        else:
            pass
    
    def relation(self,r):
        self._getWays(r, "relation")

ml = HikingWaysfromRelations()
ml.apply_file('../pbf/new-zealand.osm.pbf')

The result is a dictionary containing the expected relation as the only key, and its ways:

{"r104369": ["191668175", "765285136", "765285135", "765285138", "765285139", "191668225", "765542429", "765542430", "765542432", "765542431", "765542435", "765542436", "765542434", "765542433", "765542437", "765542438", "765542439", "765542440", "765542441", "765542442", "765548983", "271277386", "765548985", "765548984", "684295241", "684295239", "464603363", "464603364", "464607430", "299788481", "178920047", "155711655", "155711646", "684294192", "259362037", "684294189", "259362038", "259362041", "259362036", "259362043", "259362039", "259362040"]}

Now the question is: How to build a GeoJson containing a single Feature MultiLineString that connects all those ways and rebuild the expected hiking trail?

enter image description here

Based on what I've found on the net, I should re-run a simpleHander on the full .pbf file, and each time I encounter a way I'm looking for - based on the values of the above dictionary - I could reconstruct a LineString with:

import shapely.wkb as wkblib

wkbfab = osmium.geom.WKBFactory()

def getRelationGeometry(elem):
    wkb=wkbfab.create_linestring(elem)
    return wkb

The issue is that it looks like some ways have only 1 node, hence triggering following error:

RuntimeError: need at least two points for linestring (way_id=155711655)

So what would be the solution to re-build a GeoJson feature - multiLineString - of multiple ways, to be able to plot the result on https://geojson.io/#map=2/20.0/0.0 ?

How for instance openstreetmap manages to re-build the track of a relation when I hit link if not by connecting all nodes (from all ways) issued from the relation ?

Thanks a lot for your help

I know there is way with bash, where you first filter the initial pbf by keeping only relation with the tag sac_scale=hiking, and then transforming this filtered result to GeoJson - but I really want to be able to generate the same with python to understand how OSM data are stored. I just can't figure out an easy way to do so, knowing pyosmium is equivalent (supposedly) to osmium, I believe there should be an easy way there too

osmium export output/output_food-drinks.pbf -f geojson

Solution

  • Looking at the way with the id shown in the error in your post (155711655), it has two nodes, not one. Visible here as of the time of this answer.

    Knowing that, I can think of two reasons why you would get that error:

    1. You're not passing in the argument location=True to the apply_file method as suggested by the documentation:

      Because of the way that OSM data is structured, osmium needs to internally cache node geometries, when the handler wants to process the geometries of ways and areas. The apply_file() method cannot deduce by itself if this cache is needed. Therefore locations need to be explicitly enabled by setting the locations parameter to True:

      h.apply_file("test.osm.pbf", locations=True, idx='flex_mem')

      Looking at your code above, the apply_file method only has the input file as an argument, so I think this is likely your problem.

    2. The way may be referencing a node that is missing in your pbf extract. This is simple to verify with the osmium cli tool:

      osmium check-refs <your pbf file>
      

      This is the result I get from running that on a valid pbf of my own

      There are 6702885 nodes, 431747 ways, and 2474 relations in this file.
      Nodes in ways missing: 0
      

      Note the Nodes in ways missing: 0.