xslt

Merge Direct Sibling Elements Interspersed with Text


I apologize in advance if this has already been asked and answered in another post.

I've been banging my head for a bit on this problem and haven't yet found a complete solution.

Here's my problem:

I have several cases where a parent element contains multiple sets of directly adjacent sibling elements. Each set needs to be merged into a single element that keeps all of the child elements, and the sets are separated by text. For example:

XML Input

<p>Information about specific mio levels, accessed as 
<codeph><i>metaMio</i></codeph><codeph>[</codeph><codeph><i>N</i></codeph><codeph>]</codeph>
, in which
<codeph><i>N</i></codeph><codeph> &lt; kMaxMioLevels</codeph>
.
</p>

Desired XML Output

<p>Information about specific mio levels, accessed as 
<codeph><i>metaMio</i>[<i>N</i>]</codeph>
, in which
<codeph><i>N</i> &lt; kMaxMioLevels</codeph>
.
</p>

Here's what I've tried so far, but it duplicates the contents of the second element of the second set at the end of the first merged element:

Attempted XSLT

    <xsl:template match="codeph[not(preceding-sibling::node()[1][./local-name()='codeph'])][following-sibling::node()[1][./local-name()='codeph']]">
        <xsl:copy copy-namespaces="no">
            <xsl:apply-templates/>
            <xsl:choose>
                <xsl:when test="following-sibling::codeph[preceding-sibling::node()[position()=1][local-name()='codeph']]">
                    <xsl:apply-templates select="following-sibling::codeph[preceding-sibling::node()[position()=1][local-name()='codeph']]" mode="codeph-merger"/>
                </xsl:when>
                <xsl:when test="following-sibling::*[1][text()[not(parent::codeph)]]"/>
            </xsl:choose>            
        </xsl:copy>
    </xsl:template>
    
    <xsl:template match="*" mode="codeph-merger">
        <xsl:apply-templates/>
    </xsl:template>
    
    <xsl:template match="codeph[preceding-sibling::node()[1][local-name()='codeph']]"/>

Current Output

The above XSLT creates the following output, which is close but duplicates text (&lt; kMaxMioLevels):

<p>Information about specific mio levels, accessed as
<codeph><i>metaMio</i>[<i>N</i>] &lt; kMaxMioLevels</codeph>
, in which 
<codeph><i>N</i> &lt; kMaxMioLevels</codeph>
.
</p>

I've also tried xsl:for-each-group with the group-adjacent attribute, but couldn't find a way to make it work for this case.

I'm using XSLT 2.0 and Saxon-HE 12.4 as part of the DITA-OT 4.2.4.

Thanks for any help you can provide.


Solution

  • A better explanation of the required logic would be useful. As it is, the following code seems to produce the wanted result in the given example:

    XSLT 2.0

    <xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    
    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    
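    <xsl:template match="p[codeph]">
        <xsl:copy>
            <!-- split the children into runs: adjacent codeph elements (key true) or adjacent text nodes (key false) -->
            <xsl:for-each-group select="codeph|text()" group-adjacent="boolean(self::codeph)">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <!-- a run of codeph elements: emit one codeph holding the children of all of them -->
                        <codeph>
                            <xsl:apply-templates select="current-group()/node()"/>
                        </codeph>
                    </xsl:when>
                    <xsl:otherwise>
                        <!-- text between the runs is copied unchanged -->
                        <xsl:apply-templates select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>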
    
    </xsl:stylesheet>
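
    The attempt in the question duplicates text because following-sibling::codeph[preceding-sibling::node()[1][local-name()='codeph']] selects every later codeph whose immediate predecessor is a codeph, including those that belong to later sets. Grouping avoids this, since each group contains exactly one set.

    Note that select="codeph|text()" drops any other children of p (other elements, comments, processing instructions), and the attributes of p and codeph are not copied. If that matters, a more general variant might look like the sketch below. It assumes that the merge should apply to any parent element, not only p, and that the merged element should take the attributes of the first codeph in the set; neither assumption comes from the question.

    XSLT 2.0 (variant)

    <xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8"/>

    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- any element that has at least two directly adjacent codeph children -->
    <xsl:template match="*[codeph[following-sibling::node()[1][self::codeph]]]">
        <xsl:copy>
            <!-- keep the attributes of the parent -->
            <xsl:apply-templates select="@*"/>
            <!-- group all child nodes, so non-codeph elements and comments survive -->
            <xsl:for-each-group select="node()" group-adjacent="boolean(self::codeph)">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <codeph>
                            <!-- assumption: attributes are taken from the first codeph of the set -->
                            <xsl:apply-templates select="current-group()[1]/@*"/>
                            <xsl:apply-templates select="current-group()/node()"/>
                        </codeph>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:apply-templates select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

    </xsl:stylesheet>

    For the input in the question this should give the same result as the stylesheet above. Any node between two codeph elements, including a whitespace-only text node or a comment, ends the current set, so pretty-printed input with line breaks between the codeph elements would not be merged.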