sqlsql-serverxml-namespacesgpx

SQL Server - parse GPX file


I'm trying to parse a GPX file within SQL Server 2019, but I'm hitting a snag with namespaces, I think.

From what I can see - if the GPX file contains :

xmlns="http://www.topografix.com/GPX/1/1"

SQL returns a NULL. But if I remove that from the GPX file, the SQL returns a string of coords - as expected.

SQL code :

DECLARE @XML TABLE (XML_COLUMN XML)

DECLARE @sqlstmt NVARCHAR(255)
DECLARE @file NVARCHAR(255) = 'd:\demo_2.gpx'

SET @sqlstmt= 'SELECT * FROM OPENROWSET (BULK ''' + @file + ''', SINGLE_CLOB) AS xmlData'

INSERT INTO @XML 
    EXEC (@sqlstmt)

;WITH XMLNAMESPACES ('http://www.topografix.com/GPX/1/1' AS ns), X_CTE AS
(
    SELECT
        T1.Name.query('.') AS Name,
        T2.X_Content.query('.') AS X_Content
    FROM   
        @XML AS X
    CROSS APPLY 
        XML_Column.nodes('/gpx/trk') AS T1(Name)
    CROSS APPLY 
        XML_Column.nodes('/gpx/trk/trkseg/trkpt') AS T2(X_Content)
),
XML_Data AS
(
    SELECT
        X_Content.value('(/trkpt/@lat)[1]', 'VARCHAR(50)') AS LAT,
        X_Content.value('(/trkpt/@lon)[1]', 'VARCHAR(50)') AS LON
    FROM 
        X_CTE
)
SELECT 
    STUFF((SELECT '[' + LON + ',' + LAT + ']' + ','
           FROM XML_Data
           WHERE 1 = 1
           FOR XML PATH('')), 1, 0, '') AS mapString;

GPX file content (demo_2.gpx)

<?xml version="1.0" encoding="UTF-8"?>
<gpx creator="Garmin Connect" version="1.1"
  xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/11.xsd"
  xmlns:ns3="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
  xmlns="http://www.topografix.com/GPX/1/1"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns2="http://www.garmin.com/xmlschemas/GpxExtensions/v3">
  <metadata>
    <link href="connect.garmin.com">
      <text>Garmin Connect</text>
    </link>
    <time>2022-05-29T08:37:21.000Z</time>
  </metadata>
  <trk>
    <name>My Route</name>
    <type>e_bike_mountain</type>
    <trkseg>
      <trkpt lat="54.37033147551119327545166015625" lon="-3.075514398515224456787109375">
        <ele>65.8000030517578125</ele>
        <time>2022-05-29T11:37:02.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:atemp>17.0</ns3:atemp>
            <ns3:hr>155</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>
      <trkpt lat="54.37033147551119327545166015625" lon="-3.075514398515224456787109375">
        <ele>65.8000030517578125</ele>
        <time>2022-05-29T11:37:03.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:atemp>17.0</ns3:atemp>
            <ns3:hr>155</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>
    </trkseg>
  </trk>
</gpx>

Really pulling the last bits of remaining hair out with this one, if anyone can assist, that would be totally awesome!


Solution

  • Your XML defines a default namespace, that is applied to all XML nodes - as long as no other namespace is defined explicitly, by means of a prefix:

    <gpx creator="Garmin Connect" version="1.1"
         ...
         xmlns="http://www.topografix.com/GPX/1/1"
    

    The xmlns= declaration, without an alias (like xmlns:ns=...), is the default XML namespace for your XML document.

    Now if you actually define your XML namespace in the query, with an alias ns like so:

    ;WITH XMLNAMESPACES ('http://www.topografix.com/GPX/1/1' AS ns)
    

    then you must also use that alias in all your relevant XPath queries:

    SELECT
        ...
    FROM   
        @XML AS X
    CROSS APPLY 
        XML_Column.nodes('/ns:gpx/ns:trk') AS T1(Name)
    CROSS APPLY 
        XML_Column.nodes('/ns:gpx/ns:trk/ns:trkseg/ns:trkpt') AS T2(X_Content)
    

    Alternatively, and much simpler - define the XML namespace as your default namespace in the XQuery in T-SQL; then you do not need to apply the namespace alias everywhere:

    ;WITH XMLNAMESPACES (DEFAULT 'http://www.topografix.com/GPX/1/1')
    

    And in the end - you could write your XQuery much simpler - try this:

    WITH XMLNAMESPACES (DEFAULT 'http://www.topografix.com/GPX/1/1')
        SELECT
            LAT = XC.value('@lat', 'VARCHAR(50)'),
            LON = XC.value('@lon', 'VARCHAR(50)')
        FROM   
            @Xml AS X
        CROSS APPLY 
            XML_Column.nodes('/gpx/trk/trkseg/trkpt') AS XT(XC)
    

    This should return the same value - with much less code and indirections....