xml xslt saxon composite-key xslt-3.0

Copy elements matching a composite key within an external document


Background

I'm looking to merge two XML documents using XSLT 3.0 and Saxon-HE 10.5.

Problem

The composite key lookup never finds a match between the source document and the target document, so the data never merges.

Code

There are a few source files involved: "schedule", "libraries", and "copyright".

copyright.xml

In the copyright XML document, there are multiple copyright elements, each having a unique dependency locator. The locator is the composite key. When the XSLT runs, this document is provided as input. There is no guarantee that a key listed in this file has a match in the libraries file (most do, several don't).

<?xml version="1.0" encoding="UTF-8"?>
<copyrights>
  <copyright>
    <title>Apache Log4j™ 2 API</title>
    <year>2016</year>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-api</artifactId>
      <version>2.18.0</version>
    </dependency>
    <authors>
      <author>Author Name Common</author>
      <author>Author Name Copyright</author>
    </authors>
  </copyright>
</copyrights>

libraries.xml

Each entry in the libraries XML document has a unique dependency locator that corresponds to exactly one entry in the copyright.xml file:

<?xml version="1.0"?>
<libraries>
  <library>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-api</artifactId>
      <version>2.18.0</version>
    </dependency>
    <licenses>
      <license>Apache-2.0</license>
    </licenses>
    <authors>
      <author>Author Name License</author>
      <author>Author Name Common</author>
    </authors>
  </library>
</libraries>

We're looking to combine the licenses and authors from this file into the copyright.xml file.

schedule.xsl

Here's one of the many attempts to try and merge the documents together:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.1">

  <xsl:output method="xml" encoding="UTF-8" />
  <xsl:strip-space elements="*" />

  <xsl:variable name="LIB"
    select="document( resolve-uri( 'libraries.xml', base-uri( / ) ) )" />

  <xsl:key
    name="locator-key"
    match="dependency"
    use="groupId, artifactId, version"
    composite="yes" />

  <!-- Identity transform without comments. -->
  <xsl:template match="@* | *">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="copyrights">
    <copyrights><xsl:apply-templates /></copyrights>
  </xsl:template>

  <xsl:template match="copyright">
    <copyright><xsl:apply-templates /></copyright>
  </xsl:template>

  <xsl:template match="dependency">
    <dependency>
      <xsl:copy-of select="key( 'locator-key', $LIB/libraries/library, . )" />
    </dependency>
  </xsl:template>
</xsl:stylesheet>

Output

The incorrect output resembles:

<copyrights>
   <copyright>
      <title>Apache Log4j™ 2 API</title>
      <year>2016</year>
      <dependency/>
      <authors>
         <author>Author Name Copyright</author>
      </authors>
   </copyright>
</copyrights>

The desired output, which pulls data from libraries.xml based on the unique locator key, resembles:

<copyrights>
   <copyright>
      <dependency>
         <groupId>org.apache.logging.log4j</groupId>
         <artifactId>log4j-api</artifactId>
         <version>2.18.0</version>
      </dependency>
      <licenses>
         <license>Apache-2.0</license>
      </licenses>
      <authors>
         <author>Author Name Common</author>
         <author>Author Name Copyright</author>
         <author>Author Name License</author>
      </authors>
   </copyright>
</copyrights>

There is more complex logic to be added, but without being able to match on a composite key to cross-reference into the second document, no other data can be merged.

Question

How do you use a composite key to select elements from a separate document?


Solution

  • Using your copyright.xml document as the input and the following stylesheet:

    XSLT 3.0

    <xsl:stylesheet version="3.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

      <xsl:mode on-no-match="shallow-copy"/>

      <xsl:param name="lib">path/to/libraries.xml</xsl:param>

      <xsl:key name="locator-key" match="dependency" use="*" composite="yes"/>

      <xsl:template match="authors">
        <xsl:copy>
          <xsl:copy-of select="*, key('locator-key', ../dependency/*, document($lib))/../authors/*"/>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>
    

    I get:

    Result

    <?xml version="1.0" encoding="UTF-8"?>
    <copyrights>
       <copyright>
          <title>Apache Log4j™ 2 API</title>
          <year>2016</year>
          <dependency>
             <groupId>org.apache.logging.log4j</groupId>
             <artifactId>log4j-api</artifactId>
             <version>2.18.0</version>
          </dependency>
          <authors>
             <author>Author Name Common</author>
             <author>Author Name Copyright</author>
             <author>Author Name License</author>
             <author>Author Name Common</author>
          </authors>
       </copyright>
    </copyrights>
    

    Hopefully this can get you on the right track.
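
    The result above still repeats "Author Name Common" and omits the licenses. As a rough sketch of one way to get closer to your desired output (not tested against your real data), the lookup could move up to the copyright level, with the merged authors grouped by name. The `library` variable name is my own, and I'm assuming that a copyright entry without a matching library should simply keep its own authors:

    ```xml
    <xsl:stylesheet version="3.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

      <xsl:mode on-no-match="shallow-copy"/>

      <!-- Placeholder path, as in the stylesheet above. -->
      <xsl:param name="lib">path/to/libraries.xml</xsl:param>

      <xsl:key name="locator-key" match="dependency" use="*" composite="yes"/>

      <xsl:template match="copyright">
        <!-- The library whose dependency matches this one;
             the empty sequence when there is no match. -->
        <xsl:variable name="library"
            select="key('locator-key', dependency/*, document($lib))/.."/>
        <xsl:copy>
          <!-- Copy everything except authors, which is rebuilt below. -->
          <xsl:apply-templates select="@*, * except authors"/>
          <xsl:copy-of select="$library/licenses"/>
          <authors>
            <!-- Group by author name so each name is emitted once,
                 sorted alphabetically. -->
            <xsl:for-each-group
                select="authors/author, $library/authors/author"
                group-by="string(.)">
              <xsl:sort select="."/>
              <xsl:copy-of select="."/>
            </xsl:for-each-group>
          </authors>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>
    ```

    With your sample files this should emit the licenses element after dependency and list each author once. Unlike your desired output, it keeps title and year; drop them from the apply-templates selection if they are not wanted.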

    Note that my shortcut using dependency/* assumes that in both documents the dependency element contains the same number of child elements, in the same order (though not necessarily with the same names).
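
    If that assumption is too fragile, the key parts can be named explicitly instead. The following is a minimal sketch, assuming both documents use the groupId, artifactId and version element names shown in the question. It also makes the argument order of key() visible: name first, then the value sequence, then the node to search under. The attempt in the question passes the library elements as the value and the current node as the search root, which is probably why nothing matched.

    ```xml
    <xsl:stylesheet version="3.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

      <xsl:mode on-no-match="shallow-copy"/>

      <!-- Placeholder path, as in the stylesheet above. -->
      <xsl:param name="lib">path/to/libraries.xml</xsl:param>

      <!-- The comma expression fixes the order of the key parts,
           regardless of the order of the child elements. -->
      <xsl:key name="locator-key" match="dependency"
               use="groupId, artifactId, version" composite="yes"/>

      <xsl:template match="authors">
        <xsl:copy>
          <!-- The "!" operator keeps the parts in the order written;
               a "/" step would return them in document order. -->
          <xsl:copy-of select="*, key('locator-key',
              ../dependency ! (groupId, artifactId, version),
              document($lib))/../authors/*"/>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>
    ```

    The lookup value has to be built in the same order as the use expression, since composite key values are compared as whole sequences.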