I want to read an XML file generated by komoot into a DataFrame. Here is the structure of the XML file:
<?xml version='1.0' encoding='UTF-8'?>
<gpx version="1.1" creator="https://www.komoot.de" xmlns="http://www.topografix.com/GPX/1/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd">
  <metadata>
    <name>Title</name>
    <author>
      <link href="https://www.komoot.de">
        <text>komoot</text>
        <type>text/html</type>
      </link>
    </author>
  </metadata>
  <trk>
    <name>Title</name>
    <trkseg>
      <trkpt lat="60.126749" lon="4.250254">
        <ele>455.735013</ele>
        <time>2023-08-20T17:42:34.674Z</time>
      </trkpt>
      <trkpt lat="60.126580" lon="4.250247">
        <ele>455.735013</ele>
        <time>2023-08-20T17:42:36.695Z</time>
      </trkpt>
      <trkpt lat="60.126484" lon="4.250240">
        <ele>455.735013</ele>
        <time>2023-08-20T17:44:15.112Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>
I tried this code:
pd.read_xml('testfile.gpx',xpath='./gpx/trk/trkseg')
But there seems to be a problem with my xpath. Namely, I get this ValueError:
ValueError: xpath does not return any nodes. Be sure row level nodes are in xpath. If document uses namespaces denoted with xmlns, be sure to define namespaces and use them in xpath.
I tried many variations, but none of the xpath expressions I chose worked.
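As the ValueError message suggests, you need to pass a namespaces mapping to read_xml. The root gpx element declares a default namespace (xmlns="http://www.topografix.com/GPX/1/1"), so every element in the file belongs to it and an unprefixed xpath matches nothing. The xpath should also point at the row-level trkpt nodes rather than at trkseg: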
import pandas as pd

df = (
    pd.read_xml(
        "testfile.gpx",
        # "doc" is an arbitrary prefix; it only has to match the key in namespaces
        xpath=".//doc:trkseg/doc:trkpt",
        namespaces={"doc": "http://www.topografix.com/GPX/1/1"}
    )
)
Output:
print(df)
         lat       lon         ele                      time
0  60.126749  4.250254  455.735013  2023-08-20T17:42:34.674Z
1  60.126580  4.250247  455.735013  2023-08-20T17:42:36.695Z
2  60.126484  4.250240  455.735013  2023-08-20T17:44:15.112Z
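To see why the unprefixed xpath finds nothing, here is a minimal sketch using only the standard library's xml.etree.ElementTree instead of pandas. The GPX string is a trimmed copy of the file from the question (two track points, no metadata), and the names GPX, NS, plain, qualified and rows are made up for this illustration:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the GPX file from the question (assumption: two points are enough to illustrate).
GPX = """<gpx version="1.1" creator="https://www.komoot.de" xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <trkseg>
      <trkpt lat="60.126749" lon="4.250254">
        <ele>455.735013</ele>
        <time>2023-08-20T17:42:34.674Z</time>
      </trkpt>
      <trkpt lat="60.126580" lon="4.250247">
        <ele>455.735013</ele>
        <time>2023-08-20T17:42:36.695Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>"""

# The prefix "doc" is arbitrary; only the URI has to match the xmlns in the file.
NS = {"doc": "http://www.topografix.com/GPX/1/1"}

root = ET.fromstring(GPX)

# Without a prefix the path looks for elements in *no* namespace, so nothing matches.
plain = root.findall(".//trkseg/trkpt")

# With the prefix mapped to the GPX namespace, the track points are found.
qualified = root.findall(".//doc:trkseg/doc:trkpt", NS)

# Collect attributes and child element text into one dict per track point.
rows = [
    {
        "lat": float(pt.get("lat")),
        "lon": float(pt.get("lon")),
        "ele": float(pt.findtext("doc:ele", namespaces=NS)),
        "time": pt.findtext("doc:time", namespaces=NS),
    }
    for pt in qualified
]

print(len(plain), len(qualified))  # prints "0 2"
```

The list of dicts can be passed to pd.DataFrame(rows) if you prefer to build the frame by hand, but read_xml with namespaces does the same in one call.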