My job is to extract data from an XML document and build an HTML page using that data. I'm using PHP to parse and manipulate the XML document.
One portion of the XML document contains inlined elements used in a fashion similar to this:
<desc>
These are the <special>&lt;best&gt;</special>
chocolate chip cookies <special>&lt;EVER&gt;</special>
</desc>
I'd like to convert this into my HTML document like so:
These are the <em>&lt;best&gt;</em> chocolate chip cookies <em>&lt;EVER&gt;</em>
So that it displays in the browser as
These are the <best> chocolate chip cookies <EVER>
I'm currently using PHP's SimpleXML module. I have no problem parsing the XML document and retrieving the parent element (<desc>).
I thought about manipulating the raw XML string and doing a search and replace to convert the <special> tags to my target tag (<em>), but, of course, the XML parser will treat them exactly the same way, only under the name <em> instead.
I also considered retrieving the XML directly from the <desc> node at the point of use with asXML(), doing the search and replace there, and then simply echoing the raw string into the HTML document. At that point, however, it appears that the <special> nodes have already been parsed away, and I just get the string:
These are the <best> chocolate chip cookies <EVER>
I've also looked into the XMLReader class, but it seems to read the XML from a stream, so I can't access the nodes I need arbitrarily when I need them.
I'd appreciate any advice. Thanks.
Here is a solution that creates a DOM object from a SimpleXMLElement and iterates over its child nodes to build the HTML:
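$xml = <<<XML
<desc>
These are the <special>&lt;best&gt;</special> chocolate chip cookies <special>&lt;EVER&gt;</special>
</desc>
XML;
$sx = new SimpleXMLElement($xml);
// Switch to the DOM API, which exposes text nodes and elements in document order.
$dom = dom_import_simplexml($sx);
$html = '';
foreach ($dom->childNodes as $node) {
    switch ($node->nodeType) {
        case XML_ELEMENT_NODE:
            // Wrap the content of <special> in <em>; any other element is skipped.
            if ($node->tagName == 'special') {
                $html .= '<em>' . htmlspecialchars($node->textContent) . '</em>';
            }
            break;
        case XML_TEXT_NODE:
            // Re-escape plain text so characters like < and > stay literal in the HTML.
            $html .= htmlspecialchars($node->data);
            break;
    }
}
echo trim($html);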
Output:
These are the <em>&lt;best&gt;</em> chocolate chip cookies <em>&lt;EVER&gt;</em>
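For comparison, the asXML() idea from the question can also be made to work. asXML() on an element returns its markup with the child tags and entities intact, so the <special> tags were probably still in the string and simply not rendered by the browser. Below is a minimal sketch of that route. The <recipe> wrapper element is an assumption added only so that <desc> is a child node, as it seems to be in the real document, and the tag-stripping pattern assumes <desc> and <special> carry no attributes and the text contains no CDATA sections.

```php
<?php
// Hypothetical wrapper <recipe> so that <desc> is a child element, not the root.
$xml = <<<XML
<recipe>
<desc>
These are the <special>&lt;best&gt;</special> chocolate chip cookies <special>&lt;EVER&gt;</special>
</desc>
</recipe>
XML;

$sx = new SimpleXMLElement($xml);

// asXML() on a child element returns its outer XML, inline tags and entities included.
$raw = $sx->desc->asXML();

// Rename the inline tags, then strip the enclosing <desc> wrapper.
$html = str_replace(['<special>', '</special>'], ['<em>', '</em>'], $raw);
$html = trim(preg_replace('~^<desc>|</desc>$~', '', $html));

echo $html;
```

This keeps the escaping that was already in the XML rather than re-escaping with htmlspecialchars(), so it is shorter but more fragile than the DOM loop: any markup it does not explicitly rename is passed through to the page unchanged.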