xml xslt xslt-2.0 xinclude

Can I insert a default xi:fallback instance before processing an XInclude?


Say I have a source XML document that uses XIncludes, like this:

<?xml version="1.0" encoding="UTF-8"?>
<parent xmlns:xi="http://www.w3.org/2001/XInclude" xml:id="parent01">
   <xi:include href="child01.xml"/>
   <xi:include href="child02.xml"/>
   <xi:include href="child03.xml"/>
</parent>

The three further XML documents it calls in XIncludes look like this:

child01.xml:

<?xml version="1.0" encoding="UTF-8"?>
<children>
   <child xml:id="child01">
      <p>This is child 1.</p>
   </child>
</children>

child02.xml:

<?xml version="1.0" encoding="UTF-8"?>
<children>
   <child xml:id="child02">
      <p>This is child 2.</p>
   </child>
</children>

child03.xml:

<?xml version="1.0" encoding="UTF-8"?>
<children>
   <child xml:id="child03">
      <p>This is child 3.</p>
   </child>
</children>

I have an XSLT 2.0 transform like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
   <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

   <xsl:strip-space elements="*"/>

   <xsl:template match="/">
      <xsl:apply-templates select="parent"/>
   </xsl:template>

   <xsl:template match="parent">
      <volume>
         <xsl:apply-templates select="@*|.//child"/>
      </volume>
   </xsl:template>

   <xsl:template match="child">
      <chapter>
         <xsl:apply-templates select="@*|*|text()"/>
      </chapter>
   </xsl:template>

   <xsl:template match="@*|*|text()">
      <xsl:copy copy-namespaces="no">
         <xsl:apply-templates select="@*|*|text()"/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>

When all the files referenced by XIncludes are present in the same folder as parent01.xml, my transform works just fine, and produces this output:

<?xml version="1.0" encoding="UTF-8"?>
<volume xml:id="parent01">
   <chapter xml:id="child01">
      <p>This is child 1.</p>
   </chapter>
   <chapter xml:id="child02">
      <p>This is child 2.</p>
   </chapter>
   <chapter xml:id="child03">
      <p>This is child 3.</p>
   </chapter>
</volume>

However, if one file – say, child02.xml – is missing, the transform fails.

This failure would have been prevented if parent01.xml had included xi:fallback elements, like this:

<?xml version="1.0" encoding="UTF-8"?>
<parent xmlns:xi="http://www.w3.org/2001/XInclude" xml:id="parent01">
   <xi:include href="child01.xml">
      <xi:fallback>
         <child>
            <p>The file is missing.</p>
         </child>
      </xi:fallback>
   </xi:include>
   <xi:include href="child02.xml">
      <xi:fallback>
         <child>
            <p>The file is missing.</p>
         </child>
      </xi:fallback>
   </xi:include>
   <xi:include href="child03.xml">
      <xi:fallback>
         <child>
            <p>The file is missing.</p>
         </child>
      </xi:fallback>
   </xi:include>
</parent>

Then, the output would have been as follows:

<?xml version="1.0" encoding="UTF-8"?>
<volume xml:id="parent01">
   <chapter xml:id="child01">
      <p>This is child 1.</p>
   </chapter>
   <chapter>
      <p>The file is missing.</p>
   </chapter>
   <chapter xml:id="child03">
      <p>This is child 3.</p>
   </chapter>
</volume>

My question is this: is it possible to write my XSLT transform to insert an instance of xi:fallback into every xi:include before processing the XInclude – that is, to add a default xi:fallback instance where none is present, and then process the XInclude as if that xi:fallback instance had been present?

Thank you for any advice anyone can offer.


Solution

  • Expanding my comment to a full answer since this is an interesting question!

    XSLT transformations do not operate directly on the text content of XML documents, but on a tree-like representation of the content (DOM, XDM). This representation or model of the input is provided by an XML parser which, in theory, could be completely independent of the XSLT processor.

    Now the important bit is: the XML parser is in charge of performing XInclusions, not the XSLT processor. Once the XSLT processor gets to see the document model, there is no way of knowing if XInclusions have taken place. And no, to my knowledge there is no way to access the document tree both before and after XInclude in a single XSLT transformation step. You can process the same input nodes twice in a different mode, but you would also need to be able to control the XInclude feature of the XML parser from within the XSLT transformation, which is not possible.

    I suggest you take a little detour and solve your problem in two steps. First, write an XSLT transformation that adds the missing fallbacks, and apply it with XInclude processing deliberately turned off (in the XML parser preferences of an XML IDE such as Oxygen, or on the command line):

    XSLT to Fix Fallbacks

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xi="http://www.w3.org/2001/XInclude"
        version="2.0">
    
        <xsl:output method="xml" indent="yes"/>
    
        <xsl:template match="@*|node()">
            <xsl:copy>
               <xsl:apply-templates select="@*|node()"/> 
            </xsl:copy>
        </xsl:template>
    
        <xsl:template match="xi:include[not(xi:fallback)]">
            <xsl:copy>
                <xsl:apply-templates select="@*"/>
                <xi:fallback>
                    <child>
                        <p>The file is missing.</p>
                    </child>
                </xi:fallback>
                <xsl:apply-templates/>
            </xsl:copy>
        </xsl:template>
    
    </xsl:stylesheet>
    

    After that, the temporary output file will look like:

    <?xml version="1.0" encoding="UTF-8"?>
    <parent xmlns:xi="http://www.w3.org/2001/XInclude" xml:id="parent01">
        <xi:include href="child01.xml">
            <xi:fallback>
                <child>
                    <p>The file is missing.</p>
                </child>
            </xi:fallback>
        </xi:include>
        <xi:include href="child02.xml">
            <xi:fallback>
                <child>
                    <p>The file is missing.</p>
                </child>
            </xi:fallback>
        </xi:include>
        <xi:include href="child03.xml">
            <xi:fallback>
                <child>
                    <p>The file is missing.</p>
                </child>
            </xi:fallback>
        </xi:include>
    </parent>
    

    Then apply the second transformation that you already have, but turn on XInclude again before you do. This way, a missing file is replaced with its fallback content before the second transformation takes place.


    If that's unacceptable for you, you could look into XIPr, an XInclude processor written purely in XSLT 2.0 by Erik Wilde. By importing the XIPr stylesheets into your original XSLT stylesheets you could first provide the missing fallbacks as I've shown you above and then process the result with mode="xipr". In this case, you should disable any other XInclude processing by your IDE or command line tool.

    Here is how you could do it (yes, it gets kind of complicated):

    First of all, the href attributes pointing to your files would have to be absolute, because of a peculiarity of the XIPr processor:

    XML Input

    <?xml version="1.0" encoding="UTF-8"?>
    <parent xmlns:xi="http://www.w3.org/2001/XInclude" xml:id="parent01">
        <xi:include href="file:/Users/User/Desktop/child01.xml"/>
        <xi:include href="file:/Users/User/Desktop/child02.xml"/>
        <xi:include href="file:/Users/User/Desktop/child03.xml"/>
    </parent>
    

    XSLT Stylesheet

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
        xmlns:xi="http://www.w3.org/2001/XInclude"
        exclude-result-prefixes="xi">
    
        <xsl:import href="xipr.xsl"/>
    
        <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
        <xsl:strip-space elements="*"/>
    
        <xsl:template match="/">
            <xsl:variable name="fixedfallbacks">
                <xsl:apply-templates select="." mode="fixfallbacks"/>
            </xsl:variable>
            <xsl:variable name="xincluded">
                <xsl:apply-templates select="$fixedfallbacks" mode="xipr"/>
            </xsl:variable>
            <xsl:apply-templates select="$xincluded/*" mode="#default"/>
        </xsl:template>
    
        <xsl:template match="xi:include[not(xi:fallback)]" mode="fixfallbacks">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()" mode="fixfallbacks"/>
                <xi:fallback>
                    <child>
                        <p>The file is missing.</p>
                    </child>
                </xi:fallback>
            </xsl:copy>
        </xsl:template>
    
        <xsl:template match="@*|node()" mode="fixfallbacks">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()" mode="fixfallbacks"/> 
            </xsl:copy>
        </xsl:template>
    
        <xsl:template match="parent">
            <volume>
                <xsl:apply-templates select="@*|.//child"/>
            </volume>
        </xsl:template>
    
        <xsl:template match="child">
            <chapter>
                <xsl:apply-templates select="@*|*|text()"/>
            </chapter>
        </xsl:template>
    
        <xsl:template match="@*|*|text()">
            <xsl:copy copy-namespaces="no">
                <xsl:apply-templates select="@*|*|text()"/>
            </xsl:copy>
        </xsl:template>
    
    </xsl:stylesheet>
    

    Before executing this, you have to turn off the XInclude option of your XML parser, download the XIPr stylesheet, open it, and replace line 52

    <xsl:variable name="include-uri" select="resolve-uri(@href, document-uri(/))"/>
    

    with

    <xsl:variable name="include-uri" select="resolve-uri(@href)"/>
    

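    You have to do this because you are asking XIPr to XInclude an intermediate result that is a temporary tree. If you use document-uri(/) on such a tree, it will return an empty sequence, which is not allowed as the second argument of resolve-uri(). The one-argument form of resolve-uri() resolves against the static base URI of the stylesheet instead, so relative href values would no longer be resolved against the source document; this is why the href attributes above are absolute.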
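    The behaviour of document-uri() on a temporary tree is easy to check for yourself. The following stand-alone stylesheet is only a sketch (the element names report, source-has-uri and temp-has-uri are made up for illustration). It builds a temporary tree in a variable and reports whether document-uri() returns anything for the source document and for the temporary tree:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

        <xsl:output method="xml" indent="yes"/>

        <xsl:template match="/">
            <!-- A temporary tree: its root is a document node built by the stylesheet -->
            <xsl:variable name="temp">
                <parent/>
            </xsl:variable>
            <report>
                <!-- Usually true when the input was read from a file or URL -->
                <source-has-uri>
                    <xsl:value-of select="exists(document-uri(/))"/>
                </source-has-uri>
                <!-- A temporary tree has no document URI, so this is an empty sequence -->
                <temp-has-uri>
                    <xsl:value-of select="exists(document-uri($temp))"/>
                </temp-has-uri>
            </report>
        </xsl:template>

    </xsl:stylesheet>
    ```

    Run it against any XML file. The temp-has-uri value should be false, while the source-has-uri value depends on how your processor was handed the input.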

    Now, finally, if one of the files does not exist (child01.xml in this example), the result will be

    Final XML Output

    <?xml version="1.0" encoding="UTF-8"?>
    <volume xml:id="parent01">
       <chapter>
          <p>The file is missing.</p>
       </chapter>
       <chapter xml:id="child02">
          <p>This is child 2.</p>
       </chapter>
       <chapter xml:id="child03">
          <p>This is child 3.</p>
       </chapter>
    </volume>
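
    As a side note, the core idea of resolving inclusions at the XSLT level can be sketched in a few lines with the standard XSLT 2.0 functions doc-available() and doc(). This is not XIPr's actual code and not a complete XInclude implementation (it ignores xpointer and parse="text", and it assumes the source document was loaded from a file so that base-uri() is not empty). It only shows how a default fallback can be supplied when neither the file nor an xi:fallback exists:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xi="http://www.w3.org/2001/XInclude"
        version="2.0"
        exclude-result-prefixes="xi">

        <xsl:output method="xml" indent="yes"/>

        <!-- Identity template: copy everything that is not an xi:include -->
        <xsl:template match="@*|node()">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:template>

        <xsl:template match="xi:include[@href]">
            <!-- Resolve the href against the base URI of the including element -->
            <xsl:variable name="uri" select="resolve-uri(@href, base-uri(.))"/>
            <xsl:choose>
                <!-- The file exists and parses: include its document element -->
                <xsl:when test="doc-available($uri)">
                    <xsl:apply-templates select="doc($uri)/*"/>
                </xsl:when>
                <!-- An explicit fallback is present: use its content -->
                <xsl:when test="xi:fallback">
                    <xsl:apply-templates select="xi:fallback/node()"/>
                </xsl:when>
                <!-- Neither file nor fallback: emit the default fallback -->
                <xsl:otherwise>
                    <child>
                        <p>The file is missing.</p>
                    </child>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:template>

    </xsl:stylesheet>
    ```

    As with the XIPr approach, the XInclude option of the XML parser would have to be turned off for this to have any effect, because otherwise the xi:include elements are already gone by the time the stylesheet sees the tree.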