I'm attempting to remove accents from characters in a PHP string as the first step in making the string usable in a URL.
I'm using the following code:
$input = "Fóø Bår";
setlocale(LC_ALL, "en_US.utf8");
$output = iconv("utf-8", "ascii//TRANSLIT", $input);
print($output);
I would expect the output to be something like this:
F'oo Bar
However, instead of the accented characters being transliterated they are replaced with question marks:
F?? B?r
Everything I can find online indicates that setting the locale will fix this problem, but I'm already doing this. I've already checked the following details:
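- The locale I'm setting is supported by the server (it is included in the output of locale -a)
- The source and target encodings are supported by the server's iconv (they are included in the output of iconv -l)
- The input string is valid UTF-8 (verified using the mb_check_encoding function, as suggested in the answer by mercator)
- The call to setlocale is successful (it returns 'en_US.utf8' rather than FALSE)

The cause of the problem:
The server is using the wrong implementation of iconv. It has the glibc version instead of the required libiconv version.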
Note that the iconv function on some systems may not work as you expect. In such case, it'd be a good idea to install the GNU libiconv library. It will most likely end up with more consistent results.
– PHP manual's introduction to iconv
Details about the iconv implementation that is used by PHP are included in the output of the phpinfo function.
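For anyone who wants to run the same checks from a script instead of reading through phpinfo, here is a rough sketch. The function name iconv_diagnostics is made up for illustration, and it assumes the mbstring extension is loaded. ICONV_IMPL and ICONV_VERSION are constants defined by PHP's iconv extension, and the sketch returns null for them when the extension is missing.

```php
<?php
// Hypothetical helper (not part of PHP): gathers the facts checked above.
// Assumes the mbstring extension is available for mb_check_encoding().
function iconv_diagnostics(string $input, string $locale = 'en_US.utf8'): array
{
    return [
        // true when $input is a valid UTF-8 byte sequence
        'valid_utf8'    => mb_check_encoding($input, 'UTF-8'),
        // the locale name on success, false when the locale is unavailable
        'locale'        => setlocale(LC_ALL, $locale),
        // implementation name reported by the iconv extension, e.g. "glibc" or "libiconv"
        'iconv_impl'    => defined('ICONV_IMPL') ? ICONV_IMPL : null,
        'iconv_version' => defined('ICONV_VERSION') ? ICONV_VERSION : null,
    ];
}

var_dump(iconv_diagnostics("Fóø Bår"));
```

If iconv_impl reports glibc, you are most likely in the situation described above.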
(I'm not able to re-compile PHP with the correct iconv library on the server I'm working with for this project so the answer I've accepted below is the one that was most useful for removing accents without iconv support.)
I think the problem here is that your encodings consider ä and å different symbols to 'a'. In fact, the PHP documentation for strtr offers a sample for removing accents the ugly way :(
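Something along these lines might do as a starting point. It is only a sketch of the strtr approach, not the exact sample from the manual: the remove_accents name and the replacement map are my own, the map is deliberately incomplete, and it assumes the source file and the input are both UTF-8. It uses the array form of strtr because the three-argument form works byte by byte and would mangle multibyte characters.

```php
<?php
// Hypothetical helper: replaces a handful of accented letters with ASCII ones.
// The map is illustrative only; extend it for the characters you expect.
function remove_accents(string $text): string
{
    $map = [
        'á' => 'a', 'à' => 'a', 'â' => 'a', 'ä' => 'a', 'å' => 'a',
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ë' => 'e',
        'í' => 'i', 'ì' => 'i', 'î' => 'i', 'ï' => 'i',
        'ó' => 'o', 'ò' => 'o', 'ô' => 'o', 'ö' => 'o', 'ø' => 'o',
        'ú' => 'u', 'ù' => 'u', 'û' => 'u', 'ü' => 'u',
        'Å' => 'A', 'Ø' => 'O',
    ];

    // With an array argument, strtr() replaces whole keys (multibyte-safe here
    // because each key is the full UTF-8 byte sequence of one character).
    return strtr($text, $map);
}

echo remove_accents("Fóø Bår"), "\n"; // prints "Foo Bar"
```

Note that this only catches precomposed characters. Letters written as a base letter plus a combining accent will pass through unchanged.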