Tags: xml, f#, xml-namespaces, type-providers

Parsing XML with namespaces using the F# XmlProvider


I've been trying to parse an NLog XML message using XmlProvider. In my project, I defined the provider as follows:

XmlProvider<"""
    <log4j:event 
        logger="MyTest" level="INFO" timestamp="1597329370954" thread="1">
        <log4j:message>
            Running task
        </log4j:message>
        <log4j:locationInfo class="My.Task" method="Void main(System.String[])" />
        <nlog:eventSequenceNumber>
            11
        </nlog:eventSequenceNumber>
        <nlog:locationInfo assembly="My.Task, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" />
        <nlog:properties />
        <log4j:properties>
            <data name="log4japp" value="My.Task(64400)" />
            <data name="log4jmachinename" value="BNT" />
        </log4j:properties>
    </log4j:event>""">

But this doesn't compile: I get errors about missing namespace definitions. So I added xmlns declarations to the root element:

XmlProvider<"""
    <log4j:event 
        xmlns:log4j="http://jakarta.apache.org/log4j/"
        xmlns:nlog="http://www.nlog-project.org/schemas/NLog.xsd"
        logger="MyTest" level="INFO" timestamp="1597329370954" thread="1">
        <log4j:message>
           ...
        ...
    </log4j:event>""">

This lets my project compile, but it throws an exception at runtime when parsing the actual XML:

    'log4j' is an undeclared prefix. Line 1, position 2.

I worked around the problem by preprocessing the XML and removing the namespace prefixes from all the tags. While this works, I wonder if there's a better solution.


Solution

  • This works for me, but the XML namespace declarations must appear in both the sample XML and the actual data. For example, the following works:

    // Simplified declaration with namespaces included
    type X = XmlProvider<"""
      <log4j:event xmlns:log4j="http://jakarta.apache.org/log4j/"
            xmlns:nlog="http://www.nlog-project.org/schemas/NLog.xsd">
          <log4j:message>Running task</log4j:message>
          <nlog:eventSequenceNumber>11</nlog:eventSequenceNumber>
      </log4j:event>""">
    
    // Parse sample data with the namespace declarations included
    let x = X.Parse("""
    <log4j:event xmlns:log4j="http://jakarta.apache.org/log4j/"
          xmlns:nlog="http://www.nlog-project.org/schemas/NLog.xsd">
        <log4j:message>Running task</log4j:message>
        <nlog:eventSequenceNumber>11</nlog:eventSequenceNumber>
    </log4j:event>""")
    
    // Returns the data from the XML just fine!
    x.Message
    x.EventSequenceNumber
    

    If you omit the XML namespace declarations from the data, parsing fails because the prefixes are undeclared:

    let y = X.Parse("""
    <log4j:event>
        <log4j:message>Running task</log4j:message>
        <nlog:eventSequenceNumber>11</nlog:eventSequenceNumber>
    </log4j:event>""")
    

    I don't think there is a way to tell the XML type provider to implicitly include namespace declarations. I can see how this would be a useful feature, so feel free to suggest it on the project's GitHub!
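
If the incoming data cannot be changed at the source, one possible alternative to stripping the prefixes is to add the missing declarations before calling `Parse`. The following is only a sketch under some assumptions: `knownNamespaces` and `declareNamespaces` are hypothetical names introduced here, the prefix-to-URI pairs are copied from the sample XML above, and the approach relies on `XmlParserContext` from `System.Xml` to pre-register the prefixes so that the reader accepts the undeclared ones.

```fsharp
open System.IO
open System.Xml
open System.Xml.Linq

// Assumption: the same prefix/URI pairs that appear in the provider's sample XML.
let knownNamespaces =
    [ "log4j", "http://jakarta.apache.org/log4j/"
      "nlog", "http://www.nlog-project.org/schemas/NLog.xsd" ]

// Hypothetical helper: reads XML whose prefixes are undeclared and returns
// the same XML with xmlns declarations added to the root element.
let declareNamespaces (fragment: string) : string =
    let nameTable = NameTable()
    let nsManager = XmlNamespaceManager(nameTable)
    for (prefix, uri) in knownNamespaces do
        nsManager.AddNamespace(prefix, uri)
    // The parser context makes the prefixes known to the reader up front.
    let context = XmlParserContext(nameTable, nsManager, null, XmlSpace.None)
    let settings = XmlReaderSettings(ConformanceLevel = ConformanceLevel.Fragment)
    use textReader = new StringReader(fragment)
    use reader = XmlReader.Create(textReader, settings, context)
    let root = XElement.Load(reader)
    // Write the declarations onto the root so the serialized text is self-contained.
    for (prefix, uri) in knownNamespaces do
        root.SetAttributeValue(XNamespace.Xmlns + prefix, uri)
    root.ToString(SaveOptions.DisableFormatting)
```

With the `X` type from the answer, the call would then look like `X.Parse(declareNamespaces rawLogEvent)`, where `rawLogEvent` is a placeholder for the string received from NLog. This keeps the element names namespace-qualified, so the provided type still matches the sample, at the cost of parsing each message twice.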