javascript xml dom xpath dom-traversal

Get the hierarchy of an XML element with XPath


I am trying to get the ordered hierarchy of a given element in an "application/xml" response.data document that I parse with a DOM parser in JavaScript. The expression should return the list ['Grand Parent','Parent','Target'] for each A tag that has no A children, so the result is a list of lists in which the last element of each inner list is the deepest (in terms of graph depth) value of <A-title>. Thanks to @Jack Fleeting, I know I can get the targets with the XPath expression below, but I am not sure how to adapt it to get the hierarchy list: xpath = '//*[local-name()="A"][not(.//*[local-name()="A"])]/*[local-name()="A-title"]'

<A>
<A-title>Grand Parent</A-title>
   <A>
   <A-title>Parent</A-title>
      <A>
      <A-title>Target</A-title>
      </A>
   </A>
</A>

EDIT:

axios.get('WMS_URL').then((r) => {
  const parser = new DOMParser()
  const dom = parser.parseFromString(r.data, 'application/xml')
  let xpath = '//*[local-name()="A"][not(.//*[local-name()="A"])]/*[local-name()="A-title"]'
  let xpath2 = 'ancestor-or-self::A/A-title'
  var targets = dom.evaluate(xpath, dom, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null)
  var targets2 = dom.evaluate(xpath2, targets, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null)
  Array.from({ length: targets2.snapshotLength }, (_, index) => layerNames.push(targets2.snapshotItem(index).innerHTML))
})

Solution

  • The XPath //A[not(A)]/ancestor-or-self::A/A-title works in three steps: //A[not(A)] selects all A elements that have no A children, the next step navigates to all of their ancestor-or-self A elements, and the last step selects all A-title children. However, a single XPath 1.0 expression returns one flat node set; it cannot construct a list of lists of strings (or elements). So you first need to select //A[not(A)] and then, from each of those elements, select the ancestor-or-self::A/A-title elements.

    Using XPath 3.1, for instance with Saxon JS 2 (https://www.saxonica.com/saxon-js/index.xml, https://www.saxonica.com/saxon-js/documentation/index.html), you could construct a sequence of arrays of strings directly, e.g.

    //A[not(A)] ! array { ancestor-or-self::A/A-title/data() }
    

    The JavaScript code to evaluate that XPath would be, e.g.

    let result = SaxonJS.XPath.evaluate('parse-xml($xml)//A[not(A)] ! array { ancestor-or-self::A/A-title/data() }', [], { params : { 'xml' : r.data }})
    

    With DOM Level 3 XPath 1.0, I think you need quite a few more lines of code:

    let xmlDoc = new DOMParser().parseFromString(r.data, 'application/xml');
    
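    // A snapshot result is required here: snapshotLength and snapshotItem() do not work on iterator results.
    let leafAElements = xmlDoc.evaluate('//A[not(A)]', xmlDoc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);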
    
    let result = [];
    
    for (let i = 0; i < leafAElements.snapshotLength; i++) { 
      let titleEls = xmlDoc.evaluate('ancestor-or-self::A/A-title', leafAElements.snapshotItem(i), null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
      let titles = []; 
      for (let j = 0; j < titleEls.snapshotLength; j++) { 
        titles.push(titleEls.snapshotItem(j).textContent); 
      } 
      result.push(titles); 
    }
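
The two-step DOM approach above can also be written more compactly. The following is only a sketch under some assumptions: the helper names (step, snapshotToArray, titleHierarchies) are hypothetical and not part of any library, and the local-name() tests are there in case the elements are in a namespace, as the expression in the question suggests. If the document has no namespace, plain //A[not(A)] and ancestor-or-self::A/A-title should work just as well.

```javascript
// Build a namespace-agnostic XPath step such as *[local-name()="A"].
const step = (name) => `*[local-name()="${name}"]`;

// Leaf A elements: A elements that have no A children.
const LEAF_XPATH = `//${step('A')}[not(${step('A')})]`;
// Relative to a leaf: the A-title children of every ancestor-or-self A element.
const TITLES_XPATH = `ancestor-or-self::${step('A')}/${step('A-title')}`;

// Copy an ordered snapshot result into a plain array.
// Only snapshotLength and snapshotItem() are used.
function snapshotToArray(snapshot) {
  return Array.from({ length: snapshot.snapshotLength }, (_, i) => snapshot.snapshotItem(i));
}

// Hypothetical helper: returns one array of title strings per leaf A element.
function titleHierarchies(doc) {
  const SNAPSHOT = 7; // numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  const select = (xpath, context) =>
    snapshotToArray(doc.evaluate(xpath, context, null, SNAPSHOT, null));
  return select(LEAF_XPATH, doc).map((leaf) =>
    select(TITLES_XPATH, leaf).map((title) => title.textContent));
}
```

In the browser it would be called as titleHierarchies(new DOMParser().parseFromString(r.data, 'application/xml')). Because an ordered snapshot returns nodes in document order, each inner array runs from the outermost ancestor down to the leaf.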