I built a script that should generate a sitemap for my project.
The script uses strtr() to strip unwanted characters and to transliterate German umlauts.
$ers = array(
    '<' => '', '>' => '', ' ' => '-',
    'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue',
    'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
    '&' => 'und', '*' => '', ' - ' => '-',
    ',' => '', '.' => '', '!' => '', '?' => ''
);
foreach ($rs_post as $row) {
    $kategorie = $row['category'];
    $kategorie = strtr($kategorie, $ers);
    $kategorie = strtolower($kategorie);
    $kategorie = trim($kategorie);
    $org_file .= "<url><loc>https://domain.org/kategorie/" . $kategorie . "/</loc><lastmod>2016-08-18T19:02:42+00:00</lastmod><changefreq>monthly</changefreq><priority>0.2</priority></url>" . PHP_EOL;
}
Unwanted characters like "<" are replaced correctly, but the German umlauts are not converted. I have no idea why.
Does anyone have a tip for me?
Torsten
As others have noted, the most likely cause is a character encoding mismatch. Since the titles you're trying to convert are apparently in UTF-8, the problem is most likely that your PHP source code isn't. Try re-saving the file as UTF-8 text, and see if that fixes the problem.
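To illustrate why a mismatch makes strtr() silently skip the umlauts, here is a minimal sketch. It assumes the data is UTF-8 and the source file was saved as ISO-8859-1; the byte escapes stand in for what PHP would actually see in each case, and the variable names are made up for the example.

```php
<?php
// "Bären" as UTF-8 data (e.g. from the database): 'ä' is the two bytes 0xC3 0xA4.
$input = "B\xC3\xA4ren";

// What the key 'ä' looks like to PHP if the source file is saved as ISO-8859-1:
// a single byte 0xE4.
$latin1Table = array("\xE4" => 'ae');

// What the same key looks like if the source file is saved as UTF-8.
$utf8Table = array("\xC3\xA4" => 'ae');

// strtr() compares raw bytes, so the ISO-8859-1 key never matches UTF-8 data.
$withLatin1 = strtr($input, $latin1Table);
$withUtf8   = strtr($input, $utf8Table);

var_dump($withLatin1 === $input); // bool(true): nothing was replaced
var_dump($withUtf8);              // string(6) "Baeren"
```

ASCII keys such as "<" are encoded identically in both encodings, which would explain why those replacements still work.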
BTW, a simple way to debug this would be to print out both your data rows and your transliteration array into the same output file using e.g. print_r() or var_dump(), and look at the output to see if the non-ASCII characters in it look correct. If the characters look right in the data but wrong in the transliteration table (or vice versa), that's a sign that the encodings don't match.
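If the printed characters are hard to judge by eye, comparing the raw bytes is a possible variation on the same check. This is only a sketch: $fromData and $fromTable are hypothetical stand-ins for a character taken from $row['category'] and a key taken from $ers, and the is_valid_utf8() helper is made up for the example.

```php
<?php
// Stand-in for an 'ä' taken from the data (UTF-8 assumed).
$fromData = "\xC3\xA4";
// Stand-in for the 'ä' key of the table, as it would look if the
// source file were saved as ISO-8859-1.
$fromTable = "\xE4";

// bin2hex() shows the actual bytes, independent of how the output is displayed.
echo bin2hex($fromData), PHP_EOL;  // c3a4
echo bin2hex($fromTable), PHP_EOL; // e4

// Hypothetical helper: the /u modifier makes PCRE reject invalid UTF-8,
// so an empty pattern doubles as a validity check.
function is_valid_utf8($s)
{
    return @preg_match('//u', $s) === 1;
}

var_dump(is_valid_utf8($fromData));  // bool(true)
var_dump(is_valid_utf8($fromTable)); // bool(false)
```

If the two hex dumps differ for what should be the same character, the data and the source file are in different encodings.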
P.S. If you have the PHP iconv extension installed (and you probably do), consider using it to automatically convert your titles to ASCII.
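A rough sketch of what that could look like, assuming the input is UTF-8. The function name to_ascii_slug() is made up for this example, and the exact replacement iconv picks for each umlaut (e.g. "ae", "a" or "?") depends on the iconv implementation and the current locale, so check the output on your own system.

```php
<?php
// Hypothetical helper: turns a UTF-8 title into a lowercase ASCII slug.
function to_ascii_slug($text)
{
    if (function_exists('iconv')) {
        // //TRANSLIT asks iconv to approximate characters that have no
        // ASCII equivalent; the result varies by platform and locale.
        $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
        if ($ascii !== false) {
            $text = $ascii;
        }
    }

    $text = strtolower(trim($text));
    // Collapse every run of characters outside a-z and 0-9 into one hyphen.
    $text = (string) preg_replace('/[^a-z0-9]+/', '-', $text);

    return trim($text, '-');
}

echo to_ascii_slug('Hello World'), PHP_EOL; // hello-world
echo to_ascii_slug("M\xC3\xB6bel & Deko"), PHP_EOL; // output depends on iconv/locale
```

Unlike the strtr() table, this drops "&" instead of turning it into "und", so you may still want a small table for replacements like that before calling iconv.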