php xml rss

How to read this XML that has HTML tags with PHP?


I have worked with PHP and XML several times, but this XML has HTML tags at the beginning and at the end:

Link To XML

There is no direct link to the XML file, so I have to use file_get_contents().

I'm using this PHP code:

    $url = "https://www.tandildiario.com/suscripcion.php?section=4";
    $xml = file_get_contents($url);
    $feed = simplexml_load_string($xml);

    foreach ($feed->channel->item as $item) {
        .....

I have tried different things; most of the errors look like this:

Warning: simplexml_load_string(): Entity: line 14: parser error : Entity 'oacute' not defined in D:\reader.php on line 37


Solution

  • The parser fails because &oacute; is an HTML named entity that XML does not define. You could decode the HTML entities before loading the XML.

    $url = "https://www.tandildiario.com/suscripcion.php?section=5";
    $xml = file_get_contents($url);

    // Convert HTML entities such as &oacute; to UTF-8 characters before parsing.
    $feed = simplexml_load_string(html_entity_decode($xml, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401, "UTF-8"));

    foreach ($feed->channel->item as $item) {
        echo $item->asXML();
    }
    

    Update:

    Since this answer was written 7 years ago, passing null as the second parameter of html_entity_decode has been deprecated (as of PHP 8.1). I've updated the answer.
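
One caveat with decoding everything: html_entity_decode also turns &amp; and &lt; into a literal & and <, which can make otherwise valid XML malformed. Below is a minimal sketch that decodes only the HTML named entities that XML does not define. It assumes the feed is UTF-8. The helper name decodeHtmlOnlyEntities and the inline sample string are made up for illustration; they are not part of PHP or of the real feed.

```php
<?php
// Hypothetical helper (not a PHP built-in): replaces HTML named entities such
// as &oacute; with their UTF-8 characters, but leaves the entities that XML
// itself predefines (&amp; &lt; &gt; &quot; &apos;) untouched.
function decodeHtmlOnlyEntities(string $xml): string
{
    // Map of character => entity for HTML 4.01, e.g. "ó" => "&oacute;".
    $table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Flip to entity => character, then drop the ones XML already understands.
    $map = array_flip($table);
    unset($map['&amp;'], $map['&lt;'], $map['&gt;'], $map['&quot;'], $map['&apos;'], $map['&#039;']);

    return strtr($xml, $map);
}

// Inline sample standing in for the feed body (assumed shape, not the real feed).
$sample = '<rss><channel><item><title>Educaci&oacute;n &amp; salud</title></item></channel></rss>';

// Collect parser errors instead of emitting warnings.
libxml_use_internal_errors(true);
$feed = simplexml_load_string(decodeHtmlOnlyEntities($sample));

if ($feed === false) {
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), PHP_EOL;
    }
} else {
    foreach ($feed->channel->item as $item) {
        echo $item->title, PHP_EOL;
    }
}
```

With the real feed you would pass the result of file_get_contents($url) instead of $sample. The libxml_use_internal_errors(true) call is optional, but it lets you print the parser messages yourself if some other problem remains in the document.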