Given the following XML:
<items>
  <item id="item1">
    <description id="desc">
      <?RELAPP description="Relative" loc="start"?>
      <heading id="h1" level="1">HEADING 1</heading>
      <p id="p2" num="1">Paragraph A</p>
      <?RELAPP description="Relative" loc="end"?>
      <?SUMM description="Summary" loc="start"?>
      <heading id="h2" level="1">HEADING 2</heading>
      <p id="p3" num="2">Paragraph B</p>
      <p id="p4" num="3">Paragraph C</p>
      <heading id="h3" level="1">HEADING 3</heading>
      <p id="p5" num="4">Paragraph D</p>
      <p id="p6" num="5">Paragraph E</p>
      <?SUMM description="Summary" loc="end"?>
      <?drawings description="Drawings" loc="start"?>
      <drawings>
        <heading id="h4" level="1">HEADING 4</heading>
        <p id="p7" num="6">Paragraph F</p>
        <p id="p8" num="7">Paragraph G</p>
      </drawings>
      <?drawings description="Drawings" loc="end"?>
    </description>
  </item>
</items>
I'm trying to get to the text between:
<?SUMM description="Summary" loc="start"?>
and
<?SUMM description="Summary" loc="end"?>
That is:
HEADING 2 Paragraph B Paragraph C HEADING 3 Paragraph D Paragraph E
hopefully with some separation between the headings and paragraphs.
The best XSLT I've been able to come up with is:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/items">
    <myItems>
      <xsl:apply-templates/>
    </myItems>
  </xsl:template>
  <xsl:template match="item">
    <xsl:element name="info">
      <xsl:element name="summaryPI">
        <xsl:for-each select="description/processing-instruction('SUMM')">
          <xsl:value-of select="."/>
        </xsl:for-each>
      </xsl:element>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
but it only gets me this:
<?xml version="1.0" encoding="UTF-8"?>
<myItems>
  <info>
    <summaryPI>description="Summary" loc="start"description="Summary" loc="end"</summaryPI>
  </info>
</myItems>
What rule should I use to get the text I want? I tried preceding-sibling and following-sibling but couldn't get it to work. I'm using XSLT 1.0.
How about:
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:for-each select="//text()[preceding::processing-instruction('SUMM')[contains(., 'loc=&quot;start&quot;')]]
                                  [following::processing-instruction('SUMM')[contains(., 'loc=&quot;end&quot;')]]">
      <xsl:value-of select="."/>
      <xsl:if test="position()!=last()">
        <xsl:text>, </xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
Applied to your input example, the result will be:
HEADING 2, Paragraph B, Paragraph C, HEADING 3, Paragraph D, Paragraph E
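If you'd rather have each heading and paragraph on its own line than one comma-separated string, here is a rough variant of the same idea. It keeps the same text() selection and only changes what is written around each node; the blank line before later headings is my own guess at the kind of "separation" you want, so treat it as a sketch rather than the one right answer.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <!-- Same selection as above: text nodes after the SUMM start marker
         and before the SUMM end marker -->
    <xsl:for-each select="//text()
        [preceding::processing-instruction('SUMM')[contains(., 'loc=&quot;start&quot;')]]
        [following::processing-instruction('SUMM')[contains(., 'loc=&quot;end&quot;')]]">
      <!-- Assumption: a blank line before every heading except the first node -->
      <xsl:if test="parent::heading and position() != 1">
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
      <xsl:value-of select="."/>
      <!-- One line per heading or paragraph -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With your sample input this should put HEADING 2 and its two paragraphs on consecutive lines, then an empty line, then HEADING 3 and its two paragraphs.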
Note: if all the nodes between the two processing instructions can be assumed to be siblings of those instructions (as they are in your example), this can be made a little more efficient by selecting elements on the sibling axes instead:
<xsl:for-each select="//*[preceding-sibling::processing-instruction('SUMM')[contains(., 'loc=&quot;start&quot;')]]
                         [following-sibling::processing-instruction('SUMM')[contains(., 'loc=&quot;end&quot;')]]">
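To tie this back to the output structure in your own stylesheet, here is a minimal sketch that uses the sibling-based selection inside your item template, so each item gets its own summaryPI element. It assumes the SUMM markers are always direct children of description, as in your example; the element names myItems, info and summaryPI are taken from your attempt.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/items">
    <myItems>
      <xsl:apply-templates select="item"/>
    </myItems>
  </xsl:template>

  <xsl:template match="item">
    <info>
      <summaryPI>
        <!-- Child elements of this item's description that sit between
             the SUMM start marker and the SUMM end marker -->
        <xsl:for-each select="description/*
            [preceding-sibling::processing-instruction('SUMM')[contains(., 'loc=&quot;start&quot;')]]
            [following-sibling::processing-instruction('SUMM')[contains(., 'loc=&quot;end&quot;')]]">
          <xsl:value-of select="."/>
          <xsl:if test="position() != last()">
            <xsl:text>, </xsl:text>
          </xsl:if>
        </xsl:for-each>
      </summaryPI>
    </info>
  </xsl:template>
</xsl:stylesheet>
```

Starting the path at description instead of // keeps the selection inside the current item, which matters once the input has more than one item.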