php, unicode, utf-8, html-entities

How do I convert Unicode special characters to html entities?


I have the following string:

$string = "★ This is some text ★";

I want to convert it to html entities:

$string = "&#9733; This is some text &#9733;";

The solution everyone is writing about:

htmlentities("★ This is some text ★", ENT_QUOTES, "UTF-8");

But htmlentities only converts characters that have a named entity in its translation table, not every Unicode character. So it just gives me the same output as the input:

★ This is some text ★

I've also tried to combine this solution with both:

header('Content-Type: text/plain; charset=utf-8');

and:

mb_convert_encoding();

But this either prints an empty result, doesn't convert at all, or wrongly converts the stars to:

Â

How do I convert ★ and all other Unicode characters to the correct HTML entity?


Solution

  • htmlentities won't work in this case, but you can convert each non-ASCII character to UCS-4 and build a numeric entity from its code point, something like:

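    $string = "★ This is some text ★";
    // Replace every non-ASCII code point with a hexadecimal numeric entity.
    $entity = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($m) {
        // UCS-4BE gives the code point as 4 big-endian bytes, with no BOM.
        $utf = iconv('UTF-8', 'UCS-4BE', $m[0]);
        // Hex-encode the bytes and strip leading zeros, e.g. 00002605 -> 2605.
        return sprintf("&#x%s;", ltrim(strtoupper(bin2hex($utf)), "0"));
    }, $string);
    echo $entity;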
    

    &#x2605; This is some text &#x2605;
    

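If the mbstring extension is available, the same idea can probably be expressed without a regex callback. The following is a minimal sketch, not a definitive implementation: it assumes the input is valid UTF-8, and the function name to_numeric_entities is made up for illustration. It uses mb_encode_numericentity with a conversion map covering U+0080 through U+10FFFF, and it emits decimal entities (&#9733;) rather than the hexadecimal form above.

```php
<?php
// Hypothetical helper name; assumes mbstring is loaded and $text is valid UTF-8.
function to_numeric_entities(string $text): string
{
    // Conversion map: [start code point, end code point, offset, mask].
    // Code points from U+0080 to U+10FFFF are encoded; ASCII is left untouched.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($text, $map, 'UTF-8');
}

echo to_numeric_entities("★ This is some text ★");
```

Note that this only rewrites non-ASCII characters. Characters such as & and < pass through unchanged, so untrusted text would still need htmlspecialchars applied before this step.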