Our PHP page is a plain UTF-8 web page with Chinese characters in its meta descriptions.
When someone shared a link to it in WhatsApp, the preview showed broken characters, and I don't know why.
When I shared the link myself, it looked normal (not broken).
What could be the reasons for this? We added both:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
and
header('Content-Type: text/html; charset=UTF-8');
Does anyone have a clue? Thanks!
==========
The software in use (let's say blogging software) does not handle UTF-8 encoded content well, which results in non-UTF-8 output being sent to the browser.
It's not that the blogging software is flawed in all content operations. Rather the opposite: it is flawed only in some of them. But the flaw shows up on every page I've looked at there, and it is enough to make a simple UTF-8 check fail:
$ curl -s 'http://entrepreneur-times.com/l/tch/blog/?id=12' \
| php -r 'var_dump(preg_match("~~u", file_get_contents("php://stdin")));'
bool(false)
The problem is the generation of the description texts (the HTML meta tags for description and og:description). That part of the software does not take the UTF-8 encoding of the content into account and just cuts the text off at some byte length (most likely; I haven't seen the code). A cut that lands in the middle of a multi-byte character leaves an incomplete byte sequence behind, and that makes the output invalid UTF-8.
The fix here is to remove this flaw from the software.
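As a rough illustration of what such a fix could look like, here is a minimal sketch. It assumes the description is currently shortened with a byte-based function such as substr(), which I can't confirm without seeing the code. The helper name truncate_description() and the 150-character default are made up for the example, the mbstring extension has to be available, and the file must be saved as UTF-8.

```php
<?php
// Hypothetical helper: the name and the default length are assumptions for
// this example, not taken from the actual blogging software.
function truncate_description(string $text, int $maxChars = 150): string
{
    // mb_substr() counts characters rather than bytes, so it never cuts
    // through the middle of a multi-byte UTF-8 sequence.
    return mb_substr($text, 0, $maxChars, 'UTF-8');
}

$text = '中文描述文字'; // 6 characters, 18 bytes in UTF-8

// Byte-based cut: 4 bytes = one full character plus one stray byte.
$broken = substr($text, 0, 4);

// Character-based cut: exactly 4 whole characters.
$fixed = truncate_description($text, 4);

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($fixed, 'UTF-8'));  // bool(true)
```

If the limit really has to be expressed in bytes, mb_strcut() is an alternative worth looking at, since it cuts by bytes but backs off to a character boundary.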