I'm having a hard time removing diacritics from a $string. My code is:
<?php
$string = "Příliš žluťoučký kůň úpěl ďábelské ódy.";
$without_diacritics = strTr($string, "říšžťčýůúěďó", "risztcyuuedo");
echo $without_diacritics;
The expected output would be Prilis zlutoucky kun upel dabelske ody.
Instead, I'm getting a very weird result:
Puiszliuc uuluueoudoks� ku�u� s�pd�l d�scbelsks� s�dy.
I thought it could be a problem with multi-byte characters, but I've read that strtr is multi-byte safe. Is my assumption wrong? What am I missing?
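The problem is that strtr() works with bytes, not characters. In UTF-8 each of your accented letters takes two bytes, so the "from" string is twice as long as the "to" string (24 bytes versus 12). With the three-argument form, bytes are mapped one-to-one and the extra bytes in the longer string are ignored, so individual bytes of the multi-byte characters get replaced and the result is no longer valid UTF-8. A translation array works better in this case, because it replaces whole substrings: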
$string = "Příliš žluťoučký kůň úpěl ďábelské ódy.";
echo strtr($string, [
'ř' => 'r',
'í' => 'i',
'š' => 's',
'ž' => 'z',
'ť' => 't',
'č' => 'c',
'ý' => 'y',
'ů' => 'u',
'ú' => 'u',
'ě' => 'e',
'ď' => 'd',
'ó' => 'o'
]);
Output:
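Prilis zlutoucky kuň upel dábelské ody.
The letters ň, á and é are left unchanged because they are not in the array; add them as keys if you want them converted too.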
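To make the byte-versus-character point concrete, here is a minimal sketch that prints the byte lengths of the two translation strings and then wraps the array form in a helper. It assumes the file is saved as UTF-8 with precomposed characters. The function name remove_czech_diacritics is hypothetical, and the map only covers the lowercase letters from the question plus ň, á and é, so treat it as an illustration and not as a complete transliteration table.

```php
<?php
// Assumption: this file is saved as UTF-8 with precomposed (NFC) characters.
$from = "říšžťčýůúěďó";
$to   = "risztcyuuedo";

// strlen() counts bytes, not characters.
echo strlen($from), "\n"; // 24: each accented letter takes 2 bytes in UTF-8
echo strlen($to), "\n";   // 12: plain ASCII, 1 byte per letter

// Hypothetical helper: the array form of strtr() replaces whole substrings,
// so each multi-byte key is matched as a unit and never split into bytes.
function remove_czech_diacritics(string $text): string
{
    // Lowercase letters from the question, plus ň, á and é.
    // Uppercase variants are deliberately not covered in this sketch.
    static $map = [
        'á' => 'a', 'č' => 'c', 'ď' => 'd', 'é' => 'e', 'ě' => 'e',
        'í' => 'i', 'ň' => 'n', 'ó' => 'o', 'ř' => 'r', 'š' => 's',
        'ť' => 't', 'ú' => 'u', 'ů' => 'u', 'ý' => 'y', 'ž' => 'z',
    ];

    return strtr($text, $map);
}

echo remove_czech_diacritics("Příliš žluťoučký kůň úpěl ďábelské ódy."), "\n";
// Prilis zlutoucky kun upel dabelske ody.
```

Keeping the map in one place makes it easy to extend with uppercase letters or other languages later.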