The simplexml_load_file() function doesn't parse the accent characters well. The file is UTF-8 encoded, the xml tag has encoding="UTF-8".
I'm importing a UTF-8 encoded XML file with the simplexml_load_file() function. The file contains some accented characters, and when I print_r() or var_dump() them, they are displayed as strange characters.
First line in XML file is
<?xml version="1.0" encoding="UTF-8"?>
In code I'm running the basic
$xFile = simplexml_load_file($xmlFile);
I'm looping through the SimpleXML Object and fetching the word with accent characters like so
$text = (string)$p->i;
Now
var_dump($text);
shows the word with the ï garbled (something like GeÃ¯rriteerd)
instead of Geïrriteerd
I've tried file_get_contents() followed by simplexml_load_string(), and I've also tried loading the XML file with DOMDocument, but the same 'wild' characters are displayed.
Any thoughts on what else I could do?
Note: I'm working on PHP 5.4. That's the production version and I can't change it.
The issue was the Windows console's default encoding, not the XML parsing.
I changed the console encoding to UTF-8 by running chcp 65001.
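For anyone who wants to confirm that the parsed string is fine and only the display is off, here is a minimal sketch that inspects the raw bytes instead of relying on what the console prints. The inline XML and its element names (root, p, i) are made up for illustration and stand in for the real file. The ï is written as the escape "\xC3\xAF" so the result doesn't depend on how the script file itself is saved.

```php
<?php
// Hypothetical sample standing in for the real XML file.
// "\xC3\xAF" is the UTF-8 byte sequence for "ï".
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
     . "<root><p><i>Ge\xC3\xAFrriteerd</i></p></root>";

$xFile = simplexml_load_string($xml);
$text  = (string)$xFile->p->i;

// bin2hex() prints plain ASCII, so the output is not affected by the
// console code page. A correctly parsed "ï" shows up as "c3af".
echo bin2hex($text), "\n"; // 4765c3af7272697465657264

// preg_match() with the /u modifier returns false on invalid UTF-8,
// so a result of 1 means the string is well-formed UTF-8.
var_dump(preg_match('//u', $text) === 1); // bool(true)
```

If the hex dump contains c3af where the ï should be, SimpleXML has done its job and the garbling happens only when the console renders the bytes.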
@Phil's comment was helpful.