Build a TOC from heading nodes as a nested ul/li structure in XSLT


General overview

I am trying to build the table of contents (TOC) of a document from its heading nodes only: h1, h2, … h9 (that is, h[1-9]).

These headings should be arranged in a nested <ul>/<li> structure, where every <hN+1> that belongs to an <hN> goes into a <ul> placed inside the <li> of that <hN>.

Example

To make this concrete, suppose I have the following file, document.xml:

<?xml version="1.0" encoding="UTF-8"?>

<document>
<h1>Lorem <i>arepo</i> ipsum dolor</h1>
<h2>Lorem ipsum dolor</h2>
<p>
Sed ut <i>perspiciatis</i> unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?
</p>

<h1>sit amet et consectetur</h1>
<h2>Quia adipit</h2>
<h3>aliquam quaerat</h3>
<p>
Sed ut <i>perspiciatis</i> unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?
</p>
<h2>Erit et nunquam</h2>

</document>

Then the expected rendering should be:

<ul>
    <li>
        <span>Lorem <i>arepo</i> ipsum dolor</span>
        <ul>
            <li><span>Lorem ipsum dolor</span></li>
        </ul>
    </li>
    <li>
        <span>sit amet et consectetur</span>
        <ul>
            <li>
                <span>Quia adipit</span>
                <ul>
                    <li>
                        <span>aliquam quaerat</span>
                    </li>
                </ul>
            </li>
            <li><span>Erit et nunquam</span></li>
        </ul>
    </li>
</ul>

Minimal working example

So far, I have this maketoc.xslt:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  
  <xsl:template match="document">
    <ul>
      <xsl:apply-templates select="*[self::h1 | self::h2 | self::h3 | self::h4 | self::h5 | self::h6 | self::h7 | self::h8 | self::h9]"/>
    </ul>
  </xsl:template>
  
  <xsl:template match="h1 | h2 | h3 | h4 | h5 | h6 | h7 | h8 | h9">
    <li>
      <span><xsl:value-of select="."/></span>
      <ul>
        <xsl:apply-templates select="following-sibling::*[1][self::h1 | self::h2 | self::h3 | self::h4 | self::h5 | self::h6 | self::h7 | self::h8 | self::h9]"/>
      </ul>
    </li>
  </xsl:template>
  
  <!-- Ignore anything else -->
  <xsl:template match="*"/>
</xsl:stylesheet>

Current rendering

However, this MWE generates the following output:

<ul>
   <li>
      <span>Lorem arepo ipsum dolor</span>
      <ul>
         <li>
            <span>Lorem ipsum dolor</span>
            <ul/>
         </li>
      </ul>
   </li>
   <li>
      <span>Lorem ipsum dolor</span>
      <ul/>
   </li>
   <li>
      <span>sit amet et consectetur</span>
      <ul>
         <li>
            <span>Quia adipit</span>
            <ul>
               <li>
                  <span>aliquam quaerat</span>
                  <ul/>
               </li>
            </ul>
         </li>
      </ul>
   </li>
   <li>
      <span>Quia adipit</span>
      <ul>
         <li>
            <span>aliquam quaerat</span>
            <ul/>
         </li>
      </ul>
   </li>
   <li>
      <span>aliquam quaerat</span>
      <ul/>
   </li>
   <li>
      <span>Erit et nunquam</span>
      <ul/>
   </li>
</ul>

The problems

As you can see, there are several problems with this MWE:

  1. The first child heading under each parent is processed several times, once for each level of its depth. For example, “aliquam quaerat” appears three times because it is an <h3> node.
  2. An empty <ul/> appears inside every <li> that has no sub-headings.
  3. Headings that are not the first among their siblings are treated as if they were <h1> (see “Erit et nunquam”, which is an <h2>).

The question

How can I turn the heading nodes of an XML document into a nested <ul>/<li> structure in which each heading is correctly attached to its parent heading?


Solution

  • I adapted the answer to “Converting flat hierarchy to nested hierarchy in XSLT depth” to the problem presented here.

    This is an XSLT 1.0 solution but to me it seems more elegant than the XSLT 2.0/3.0 approach suggested in the other answer.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="html"/>

      <!-- Index each hN (N > 1) under the id of its nearest preceding h(N-1) sibling -->
      <xsl:key name="child" match="h2|h3|h4|h5|h6" use="generate-id(preceding-sibling::*[name()=concat('h', substring-after(name(current()), 'h') - 1)][1])"/>

      <!-- The top-level list holds only the h1 entries -->
      <xsl:template match="/document">
        <ul>
          <xsl:apply-templates select="h1"/>
        </ul>
      </xsl:template>

      <!-- One entry per heading; copy-of keeps inline markup such as <i> -->
      <xsl:template match="h1|h2|h3|h4|h5|h6">
        <li>
          <span>
            <xsl:copy-of select="node()"/>
          </span>
          <ul>
            <!-- Sub-headings registered under this heading's id -->
            <xsl:apply-templates select="key('child', generate-id())"/>
          </ul>
        </li>
      </xsl:template>

    </xsl:stylesheet>
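
The stylesheet above still writes an empty <ul> under headings that have no sub-headings (problem 2 in the question). A possible variation is sketched here; I have not tested it against every processor. It stores the key lookup in a variable and emits the nested list only when that node-set is non-empty. The variable name children is my own choice; everything else mirrors the stylesheet above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Same key as above: each hN (N > 1) is indexed under the id of its
       nearest preceding h(N-1) sibling. -->
  <xsl:key name="child" match="h2|h3|h4|h5|h6"
           use="generate-id(preceding-sibling::*[name() = concat('h', substring-after(name(current()), 'h') - 1)][1])"/>

  <xsl:template match="/document">
    <ul>
      <xsl:apply-templates select="h1"/>
    </ul>
  </xsl:template>

  <xsl:template match="h1|h2|h3|h4|h5|h6">
    <!-- "children" is an illustrative name: the sub-headings of this heading. -->
    <xsl:variable name="children" select="key('child', generate-id())"/>
    <li>
      <span>
        <xsl:copy-of select="node()"/>
      </span>
      <!-- Write the nested list only when there is something to put in it. -->
      <xsl:if test="$children">
        <ul>
          <xsl:apply-templates select="$children"/>
        </ul>
      </xsl:if>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

With this change, a leaf entry such as “Erit et nunquam” should come out as a bare <li> with its <span> and no trailing list. To cover h7 to h9 as in the question, both match patterns would need to be extended. Note also that the key looks only at the nearest preceding h(N-1) sibling, so a document that skips a level (an h3 directly under an h1, say) would have that heading attached to an earlier h2 or dropped altogether.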