php unicode get query-string double-byte

Double-byte characters in querystring using PHP


I'm trying to figure out how to create personalized urls for double-byte languages.

For example, this URL from Amazon Japan has Japanese characters within the URL itself (specifically in the path, not the query string):

http://www.amazon.co.jp/風の谷のナウシカ-DVD-宮崎駿/dp/B00005R5J3/ref=sr_1_3?ie=UTF8&s=dvd&qid=1269891925&sr=8-3

What I would like to do is have:

http://www.mysite.com/風の谷のナウシカ

or even

http://www.mysite.com/index.php?name=風の谷のナウシカ

be able to properly decode the $_GET['name'] string.

I think I have tried all of the urldecode and utf8_decode possibilities, but I just get gibberish in response.

This all works fine when the value comes from a form via $_POST, but I need these URLs to be emailable...

EDIT: Here is the code I'm using:

<p>Original: <?= $_GET['str'] ?>

<br>Decode: <?= urldecode($_GET['str']) ?>

<br>Decode querystring: <?= urldecode($_SERVER['QUERY_STRING']) ?>

<p>

<?php
   // Dump every server variable to inspect the raw request.
   // foreach replaces each(), which was removed in PHP 8.
   foreach ($_SERVER as $var => $value) {
      echo "$var => $value <br />";
   }
?>

Solution

  • Got it!

    I needed to make sure the response header declared the UTF-8 charset:

    header ('Content-type: text/html; charset=utf-8');
    

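    Once I did that, the characters were interpreted properly. The values in $_GET were already decoded UTF-8; the browser had simply been rendering the page in a different charset.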

    I also found this, which is a very good resource:

    http://www.phpwact.org/php/i18n/utf-8
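
For anyone landing here, a minimal sketch of the whole round trip, assuming the page and the link are both UTF-8. The helper names (buildNameUrl, nameFromQuery, nameFromPath) are hypothetical and only for illustration; the path variant also assumes the web server routes every request to this one script, which the original post does not cover.

```php
<?php
// Hypothetical helpers for illustration; the names are not from the original post.

// Build an emailable link: rawurlencode() percent-encodes the UTF-8 bytes,
// so the resulting URL is plain ASCII.
function buildNameUrl(string $base, string $name): string
{
    return $base . '?name=' . rawurlencode($name);
}

// Recover the name from a full URL. On a live request PHP has already
// done this decoding by the time you read $_GET['name'].
function nameFromQuery(string $url): string
{
    $query = (string) parse_url($url, PHP_URL_QUERY);
    parse_str($query, $params);   // parse_str() URL-decodes the values
    return (string) ($params['name'] ?? '');
}

// Recover the name from the path form, e.g. /%E9%A2%A8...
// Assumes the request URI would come from $_SERVER['REQUEST_URI'].
function nameFromPath(string $requestUri): string
{
    $path = (string) parse_url($requestUri, PHP_URL_PATH);
    return rawurldecode(ltrim($path, '/'));
}

$url  = buildNameUrl('http://www.mysite.com/index.php', '風の谷のナウシカ');
$name = nameFromQuery($url);

// Declare the charset before any output so the browser reads the bytes as UTF-8.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Escape before echoing, since the value comes from the URL.
echo '<p>Name: ' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</p>';
```

Note that there is no extra urldecode() call on the decoded value: decoding an already-decoded string can corrupt names that contain a literal % or +.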