xslt

Creating a tree of XSD elements without duplicates


I'm building an XSD viewer using XSLT. The viewer should list all elements hierarchically, and I am currently stuck on deduplicating sub-elements that appear more than once in their parent (e.g. because of xs:choice or xs:sequence constructs).

Here's my current bare-bones transformation:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet
  version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:map="http://www.w3.org/2005/xpath-functions/map"
  xmlns:array="http://www.w3.org/2005/xpath-functions/array"
  exclude-result-prefixes="#all"
>

  <xsl:output method="text" media-type="text/plain" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <xsl:apply-templates select="xs:schema/xs:element">
      <xsl:with-param name="parents" select="()"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="xs:element">
    <xsl:param name="parents" as="xs:string*"/>

    <xsl:for-each select="1 to count($parents)">-</xsl:for-each>
    <xsl:value-of select="@name"/>
    <xsl:text>&#xa;</xsl:text>
    <xsl:apply-templates select="
      xs:element |
      xs:complexType |
      xs:sequence |
      xs:group[@ref] |
      xs:choice |
      //xs:complexType[@name=current()/@type]
    ">
      <xsl:with-param name="parents" select="$parents, @name"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="xs:group[@ref]">
    <xsl:param name="parents" as="xs:string*"/>

    <xsl:apply-templates select="//xs:group[@name=current()/@ref]">
      <xsl:with-param name="parents" select="$parents"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="xs:complexType | xs:sequence | xs:choice | xs:group[@name]">
    <xsl:param name="parents" as="xs:string*"/>

    <xsl:apply-templates select="xs:element | xs:complexType | xs:sequence | xs:group[@ref] | xs:choice">
      <xsl:with-param name="parents" select="$parents"/>
    </xsl:apply-templates>
  </xsl:template>

</xsl:stylesheet>

My main target is the MusicXML XSD. In this schema, a sub-element is often repeated within its parent, such as the chord element in note. My goal is to show chord only once under note in my output. The output omits everything that is not an xs:element, such as xs:group and xs:complexType.

My idea is to discard the current xs:element if the same element with the same parents has already been seen. I tried various approaches with xsl:accumulator and carrying extra state in the xsl:apply-templates invocations, but so far no success. I would appreciate some fresh ideas here. Thanks!

EDIT: Running the stylesheet above (saved as supported.xsl) against the musicxml.xsd schema linked above with xslt3 produces the following output (truncated for brevity):

$ xslt3 -xsl:supported.xsl -s:musicxml.xsd
score-partwise
-work
[..]
-part
--measure
---note
----grace
----chord
----pitch
-----step
-----alter
-----octave
----unpitched
-----display-step
-----display-octave
----rest
-----display-step
-----display-octave
----tie
----cue
----chord
----pitch
-----step
-----alter
-----octave
----unpitched
-----display-step
-----display-octave
----rest
-----display-step
-----display-octave
----cue
[..]

In this output, you can see chord and other elements repeated under note. I want to avoid repeating these elements when they have the same parent path (in this example, /score-partwise/part/measure/note).


Solution

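  • This looks like a grouping problem to me. I would first build the hierarchy as an XML tree and then group the children of each element by node-name(), processing only the first element of each group so that every child name is output once per parent: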

    <xsl:stylesheet
      version="3.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      xmlns:map="http://www.w3.org/2005/xpath-functions/map"
      xmlns:array="http://www.w3.org/2005/xpath-functions/array"
      exclude-result-prefixes="#all"
    >
      
      <xsl:output method="text" media-type="text/plain" omit-xml-declaration="yes"/>
    
      <xsl:template match="/">
        <!-- Pass 1: build a temporary tree in which every xs:element becomes an element named after it -->
        <xsl:variable name="hierarchy" as="element()*">
          <xsl:apply-templates select="xs:schema/xs:element" mode="h"/>
        </xsl:variable>
        <!-- Pass 2: print the temporary tree in the default mode -->
        <xsl:apply-templates select="$hierarchy"/>
      </xsl:template>
    
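      <xsl:template match="*">
        <!-- The depth in the temporary tree determines the number of dashes -->
        <xsl:for-each select="1 to count(ancestor::*)">-</xsl:for-each>
        <xsl:value-of select="node-name()"/>
        <xsl:text>&#xa;</xsl:text>
        <!-- Same-named children form one group; "." is the first member of the group, so each name is printed once -->
        <xsl:for-each-group select="*" group-by="node-name()">
          <xsl:apply-templates select="."/>
        </xsl:for-each-group>
      </xsl:template>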
    
      <xsl:mode name="h" on-no-match="shallow-skip"/>
    
      <xsl:template match="xs:group[@ref]" mode="h">
        <xsl:apply-templates select="//xs:group[@name=current()/@ref]" mode="#current"/>
      </xsl:template>
      
      <xsl:template match="xs:element" mode="h">
        <xsl:element name="{@name}">
          <xsl:apply-templates select="
            xs:element |
            xs:complexType |
            xs:sequence |
            xs:group[@ref] |
            xs:choice |
            //xs:complexType[@name=current()/@type]
          " mode="#current"/>      
        </xsl:element>
      </xsl:template>
    
      <xsl:template match="xs:complexType | xs:sequence | xs:choice | xs:group[@name]" mode="h">
        <xsl:apply-templates select="xs:element | xs:complexType | xs:sequence | xs:group[@ref] | xs:choice" mode="#current"/>
      </xsl:template>
    
    </xsl:stylesheet>
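
  • One caveat with the stylesheet above: only the first element of each group is processed, so children that occur only under a later same-named sibling are not listed. If that matters for your schema, a possible variant groups a whole sequence of siblings and recurses on the children of all members of each group. This is a sketch only and has not been run against the MusicXML schema. The named template print and its parameters nodes and depth are names made up for this example, and the mode "h" templates are the same as above:

    <xsl:stylesheet
      version="3.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="#all"
    >

      <xsl:output method="text" media-type="text/plain"/>

      <xsl:template match="/">
        <xsl:variable name="hierarchy" as="element()*">
          <xsl:apply-templates select="xs:schema/xs:element" mode="h"/>
        </xsl:variable>
        <xsl:call-template name="print">
          <xsl:with-param name="nodes" select="$hierarchy"/>
        </xsl:call-template>
      </xsl:template>

      <!-- Hypothetical helper: prints each distinct name in $nodes once,
           then recurses on the children of every element with that name -->
      <xsl:template name="print">
        <xsl:param name="nodes" as="element()*"/>
        <xsl:param name="depth" as="xs:integer" select="0"/>
        <xsl:for-each-group select="$nodes" group-by="node-name()">
          <xsl:for-each select="1 to $depth">-</xsl:for-each>
          <xsl:value-of select="current-grouping-key()"/>
          <xsl:text>&#xa;</xsl:text>
          <xsl:call-template name="print">
            <!-- Children of all group members, not just the first one -->
            <xsl:with-param name="nodes" select="current-group()/*"/>
            <xsl:with-param name="depth" select="$depth + 1"/>
          </xsl:call-template>
        </xsl:for-each-group>
      </xsl:template>

      <!-- Mode "h" is unchanged: it builds the temporary tree -->
      <xsl:mode name="h" on-no-match="shallow-skip"/>

      <xsl:template match="xs:group[@ref]" mode="h">
        <xsl:apply-templates select="//xs:group[@name=current()/@ref]" mode="#current"/>
      </xsl:template>

      <xsl:template match="xs:element" mode="h">
        <xsl:element name="{@name}">
          <xsl:apply-templates select="
            xs:element |
            xs:complexType |
            xs:sequence |
            xs:group[@ref] |
            xs:choice |
            //xs:complexType[@name=current()/@type]
          " mode="#current"/>
        </xsl:element>
      </xsl:template>

      <xsl:template match="xs:complexType | xs:sequence | xs:choice | xs:group[@name]" mode="h">
        <xsl:apply-templates select="xs:element | xs:complexType | xs:sequence | xs:group[@ref] | xs:choice" mode="#current"/>
      </xsl:template>

    </xsl:stylesheet>

    The trade-off is that the printed subtree of a name becomes the union of the children of all its same-named siblings. Whether that or "first occurrence wins" is preferable depends on what the viewer is supposed to show.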