php ajax urdu

API returning strange characters instead of foreign-language text


I am trying to use an AJAX-based API to get content in Urdu. Whenever I call the API, I see strange characters instead of Urdu text, and I think the server is not encoding the results properly before returning them.

The api endpoint

As far as I understand, to return proper Urdu characters the server would need to call mb_convert_encoding before sending the response to the client. Since this is a public API and I have no access to their server, I can't do that.

I want to convert the characters back to proper Urdu text on my side.

Something like this:

$strangeLetters = '\u06c1\u062a\u06d2 \u0641\u0644\u0633\u0637\u06cc\u0646\u06cc\u0648\u06ba \u067e\u0631 \u0627\u0633\u0631\u0627\u0626\u06cc\u0644\u06cc \u062c\u0627\u0631\u062d\u06cc\u062a \u062c\u0646\u06af\u06cc \u062c\u0631\u0627\u0626\u0645 \u06a9\u06d2 \u0632\u0645\u0631\u06d2 \u0645\u06cc\u06ba \u0622\u062a\u06cc \u06c1\u06d2\u060c \u0648\u0632\u06cc\u0631\u0627\u0639\u0638\u0645';

$properUrduCharacters = someFunction($strangeLetters);

echo $properUrduCharacters;

Expected result:

ہتے فلسطینیوں پر اسرائیلی جارحیت جنگی جرائم کے زمرے میں آتی ہے، وزیراعظم

Solution

  • The quick and easy way to turn a Unicode escape sequence into the actual character in PHP:

    echo json_decode('"\u06c1"');

    Other solutions here: How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?

    For your example:

    <?php
    $strangeLetters = '\u06c1\u062a\u06d2 \u0641\u0644\u0633\u0637\u06cc\u0646\u06cc\u0648\u06ba \u067e\u0631 \u0627\u0633\u0631\u0627\u0626\u06cc\u0644\u06cc \u062c\u0627\u0631\u062d\u06cc\u062a \u062c\u0646\u06af\u06cc \u062c\u0631\u0627\u0626\u0645 \u06a9\u06d2 \u0632\u0645\u0631\u06d2 \u0645\u06cc\u06ba \u0622\u062a\u06cc \u06c1\u06d2\u060c \u0648\u0632\u06cc\u0631\u0627\u0639\u0638\u0645';
    
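    // The string is plain ASCII made of JSON-style \uXXXX escapes. Wrapping it in
    // double quotes makes it a valid JSON string, so a single json_decode() call
    // converts every escape to UTF-8. (This assumes the text contains no raw
    // double quotes or control characters, which would need escaping first.)
    $properUrduCharacters = json_decode('"' . $strangeLetters . '"');

    echo $properUrduCharacters;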
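
If the API returns a JSON document (an assumption, since the endpoint's response format is not shown here), decoding the whole response body with json_decode handles the escapes in every field at once. The following is a minimal sketch: `$responseBody` and its `title` field are hypothetical stand-ins for whatever the real API sends.

```php
<?php
// Hypothetical response body; the real API's field names are not known here.
// In a single-quoted PHP string, \u0648 stays as a literal backslash sequence.
$responseBody = '{"title":"\u0648\u0632\u06cc\u0631\u0627\u0639\u0638\u0645"}';

// json_decode() converts every \uXXXX escape to UTF-8 while parsing.
$data = json_decode($responseBody, true);
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

// The decoded value is ordinary UTF-8 text.
echo $data['title'], PHP_EOL;

// When re-encoding (for example, to pass the data on to the browser),
// JSON_UNESCAPED_UNICODE keeps the characters readable instead of
// escaping them as \uXXXX again.
echo json_encode($data, JSON_UNESCAPED_UNICODE), PHP_EOL;
```

The escaped and unescaped forms are both valid JSON for the same text, so a client that parses the response with a JSON parser (such as JSON.parse in the browser) should see the Urdu characters either way. The escapes only look strange when the raw response is printed without being decoded.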