saxon-js

Can you provide collection finders in Saxon-Js, or am I barking up the wrong tree?


I'm trying to run saxon-js on the command line to apply XSLT 3 transformations that currently work in another system with Saxon HE, because saxon-js looks like it offers a lot more versatility.

I am essentially brand new to XSLT, so the learning curve is steep.

The error on which I am currently stuck is this:

Transformation failure: Error FODC0002 at iati.xslt#90 Unknown collection (no collectionFinder supplied)

The snippet of XSLT which triggers this is:

  <xsl:variable name="iati-codelists">
    <codes version="2.03">
      <xsl:apply-templates select="collection('../lib/schemata/2.03/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
      <xsl:apply-templates select="collection('../lib/schemata/non-embedded-codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
    </codes>
  </xsl:variable>

This is meant to go to each of those directories and sweep up its .xml files as a collection.

Looking at the saxon-js docs, I see no option to provide a collection finder.

Is this something implemented in Saxon HE (which is presently doing the work) and not currently in Saxon-Js? Or am I barking up a different but equally wrong tree?

Thanks!


Solution

  • Yes, you can. However, the collectionFinder doesn't seem to work asynchronously (it cannot be an async function), so it is less useful in an asynchronous application: the documents have to be loaded before the transform starts.

    As a proof of concept, I got this to work in my Node app by hardcoding the paths used in the collection() calls. There is definitely a better way to do this.

    If this is your XSLT:

    <xsl:variable name="iati-codelists">
        <codes version="2.03">
          <xsl:apply-templates select="collection('../lib/schemata/2.03/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
          <xsl:apply-templates select="collection('../lib/schemata/non-embedded-codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
        </codes>
        <codes version="2.02">
          <xsl:apply-templates select="collection('../lib/schemata/2.02/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
          <xsl:apply-templates select="collection('../lib/schemata/non-embedded-codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
        </codes>
        <codes version="2.01">
          <xsl:apply-templates select="collection('../lib/schemata/2.01/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
          <xsl:apply-templates select="collection('../lib/schemata/non-embedded-codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
        </codes>
        <codes version="1.05">
          <xsl:apply-templates select="collection('../lib/schemata/1.05/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
        </codes>
        <codes version="1.04">
          <xsl:apply-templates select="collection('../lib/schemata/1.04/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
        </codes>
        <codes version="1.03">
          <xsl:apply-templates select="collection('../lib/schemata/1.03/codelist/?select=*.xml;recurse=yes')" mode="get-codelists"/>
        </codes>
    </xsl:variable>
    

    Before the transform runs, this code builds an object holding the collections. The keys are the part of the file path that identifies each codelist directory, and each directory contains a series of XML documents. The values are arrays of those documents, loaded into the form SaxonJS needs with SaxonJS.getResource(). Keeping Promises inside the object got a little tricky, so I used Lodash.

    const _ = require('lodash');
    const fs = require('fs');
    const fsPromises = fs.promises;
    
    const SaxonJS = require('saxon-js');
    
    // load the codelists up front, since collectionFinder can't be async
    // (the `await` calls below assume this code runs inside an async function)
    let codelistPaths = [
        "non-embedded-codelist/",
        "2.03/codelist/",
        "2.02/codelist/",
        "2.01/codelist/",
        "1.05/codelist/",
        "1.04/codelist/",
        "1.03/codelist/"
    ];
                    
    // Build an object whose keys are the codelistPaths and whose values are arrays of
    // codelist documents loaded with SaxonJS.getResource (all Promises are resolved here)
    let resources = _.zipObject(codelistPaths, await Promise.all(_.map(codelistPaths, async (path) => {
        let files = await fsPromises.readdir("./IATI-Rulesets/lib/schemata/" + path);
        // keep only .xml files, mirroring select=*.xml in the stylesheet
        let xmlFiles = files.filter((file) => file.endsWith(".xml"));
        return await Promise.all(xmlFiles.map((file) => {
            return SaxonJS.getResource({ type : 'xml', file : "./IATI-Rulesets/lib/schemata/" + path + file })
        }))
    })))
    
    // this pulls the right array of SaxonJS resources from the resources object
    const collectionFinder = (url) => {
        if (url.includes("codelist")) {
            // keep only the part after "schemata/" and before the "?" (drops the file:// prefix and the query)
            let path = url.split('schemata/')[1].split('?')[0];
            return resources[path] || [] // an unknown directory yields an empty collection
        } else {
            return []
        }
    }
    
    // Applying the XSLT3 Ruleset to IATI Files Using SaxonJS
    let results = await SaxonJS.transform({
        stylesheetFileName: "path to your .sef.json",
        sourceFileName: "path to your input .xml",
        destination: "serialized",
        collectionFinder: collectionFinder
    }, "async")
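
    The stylesheet asks for recurse=yes, but the readdir call above only lists a single directory level. A minimal sketch of a recursive listing that keeps only .xml files is shown here, using Node's built-in fs and path modules. The helper name listXmlFiles is hypothetical and not part of SaxonJS. It is synchronous on purpose, but the documents would still need to be loaded with SaxonJS.getResource() before the transform starts.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not a SaxonJS API): recursively collect the paths of
// all .xml files under `dir`, roughly what select=*.xml;recurse=yes asks for.
function listXmlFiles(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // descend into subdirectories
      found.push(...listXmlFiles(full));
    } else if (entry.isFile() && entry.name.toLowerCase().endsWith('.xml')) {
      found.push(full);
    }
  }
  // sort so the collection order does not depend on the file system
  return found.sort();
}
```

    The returned paths could then replace the plain readdir result in the preload step, with each path passed as the file option of SaxonJS.getResource().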

    Full details in the Saxon-JS support forum: https://saxonica.plan.io/issues/4797?pn=1#change-16579
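
    The URL handling in the collectionFinder above can also be pulled out into a small pure function, which makes it easy to check without running a transform. The sketch below assumes the finder receives a URI of the same shape as in the answer (a file:// prefix, then a path containing schemata/, then an optional ?query). The names collectionKey and makeCollectionFinder are hypothetical.

```javascript
// Hypothetical helper: extract the key used in the preloaded `resources`
// object, i.e. the text after "schemata/" and before any "?" query.
function collectionKey(uri, marker = 'schemata/') {
  const start = uri.indexOf(marker);
  if (start === -1) return null; // not one of the codelist collections
  const rest = uri.slice(start + marker.length);
  const queryStart = rest.indexOf('?');
  return queryStart === -1 ? rest : rest.slice(0, queryStart);
}

// Hypothetical factory: wrap a preloaded { key: [documents] } object in a
// synchronous finder that returns an empty array for unknown collections.
function makeCollectionFinder(resources) {
  return (uri) => {
    const key = collectionKey(uri);
    const known = key !== null &&
      Object.prototype.hasOwnProperty.call(resources, key);
    return known ? resources[key] : [];
  };
}

// Example:
// collectionKey('file:///app/lib/schemata/2.03/codelist/?select=*.xml;recurse=yes')
// returns '2.03/codelist/'
```

    The result of makeCollectionFinder(resources) could be passed as the collectionFinder option in place of the inline function.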