I am using <xsl:template match="m:*/text()"> to match text in my XML document. This works fine for plain text and for known entities, i.e. for predefined entities like &amp; and for numeric character references like &#x3C0;.
However, matching custom entity names does not work. For example, I have an entity &pi; in my XML document that should be matched using text(). For some reason that entity is not treated as text, so nothing is matched.
Please note that I declared the entity in the DOCTYPE declaration of both the XML document and the XSLT document:
<!DOCTYPE xsl:stylesheet [<!ENTITY pi "π">]>
Is text() the right approach to matching custom entity names, or do I need to use another function? (Maybe I also did something wrong when declaring the entity?)
Thanks
Edit
XML
<!DOCTYPE mathml [<!ENTITY pi "π">]>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">
<mi>&pi;</mi>
<mi>test</mi>
<mi>&#x3C0;</mi>
</math>
XSLT
<?xml version='1.0' encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet [<!ENTITY pi "π">]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:m="http://www.w3.org/1998/Math/MathML"
version='1.0'>
<xsl:template match="m:*/text()">
<xsl:call-template name="replaceEntities">
<xsl:with-param name="content" select="normalize-space()"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="replaceEntities">
<xsl:param name="content"/>
<xsl:value-of select="$content"/>
</xsl:template>
</xsl:stylesheet>
The variable $content should be printed three times, but only test and π are printed.
Processing using PHP
$xslDoc = new DOMDocument();
$xslDoc->load("doc.xsl");
$xslProcessor = new \XSLTProcessor();
$xslProcessor->importStylesheet($xslDoc);
$mathMLDoc = new DOMDocument();
$mathMLDoc->loadXML('<!DOCTYPE mathml [<!ENTITY pi "π">]><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&pi;</mi><mi>test</mi><mi>&#x3C0;</mi></math>');
echo $xslProcessor->transformToXML($mathMLDoc);
As far as I can see, the problem is that the DTD is not visible to the XSLT stylesheet: DOMDocument does not expand entity references by default, so &pi; never reaches the stylesheet as text. Tell the document to substitute entities with their textual value before transforming it:
$mathMLDoc->substituteEntities = true;
as in
$xslDoc = new DOMDocument();
$xslDoc->load("tree.xsl");
$xslProcessor = new \XSLTProcessor();
$xslProcessor->importStylesheet($xslDoc);
$mathMLDoc = new DOMDocument();
$mathMLDoc->substituteEntities = true;
$mathMLDoc->loadXML('<!DOCTYPE math [<!ENTITY pi "π">]><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&pi;</mi><mi>test</mi><mi>&#x3C0;</mi></math>');
echo $xslProcessor->transformToXML($mathMLDoc);
which will produce
<?xml version="1.0"?>
πtestπ
Some background: http://php.net/manual/en/xsltprocessor.transformtoxml.php#99932 and http://hublog.hubmed.org/archives/001854.html.
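To see why text() has nothing to match, it can help to look at the node that DOMDocument builds for the entity reference. The following is a minimal sketch rather than a definitive diagnosis: the variable names are made up for illustration, and it uses the LIBXML_NOENT parser option as an alternative to setting substituteEntities. I would expect the first document to hold an entity reference node inside <mi> and the second to hold an ordinary text node.

```php
<?php
// Sketch: compare how DOMDocument represents &pi; with and without
// entity substitution. Variable names here are illustrative only.
$xml = '<!DOCTYPE math [<!ENTITY pi "&#x3C0;">]>'
     . '<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>&pi;</mi></math>';

// Default load: entity references are left unexpanded in the tree.
$plain = new DOMDocument();
$plain->loadXML($xml);
$plainChild = $plain->getElementsByTagName('mi')->item(0)->firstChild;

// LIBXML_NOENT asks libxml to substitute entities while parsing.
$expanded = new DOMDocument();
$expanded->loadXML($xml, LIBXML_NOENT);
$expandedChild = $expanded->getElementsByTagName('mi')->item(0)->firstChild;

// Report whether each child is an entity reference node or a text node.
var_dump($plainChild->nodeType === XML_ENTITY_REF_NODE);
var_dump($expandedChild->nodeType === XML_TEXT_NODE);
```

If both checks report true, the stylesheet itself is fine and only the way the input document is parsed needs to change.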