I have a blog where users write content on one page using TinyMCE. The content is sanitized, validated, and stored in the database, then displayed on a different page. Pretty standard. However, I allow users to use emojis and special characters. Before storing the body content I run it through HTMLPurifier, and I run it through HTMLPurifier a second time before display.
This is the code that purifies the body content before it is stored in the database:
//Purify the body HTML
$purifier = new HTMLPurifier();
$news_data['body']['clean'] = $purifier->purify($news_data['body']['raw']);
The emojis look normal in the database, but on the webpage they come out garbled, even when I output the raw data without purifying it (see screenshot).
This is the display code:
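// $config is created elsewhere in my code; a default config is assumed here
$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);
$purified_body = $purifier->purify($row['body']);
$clean_body = html_entity_decode($purified_body, ENT_QUOTES, 'UTF-8');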
So, why aren't my emojis and special characters displaying properly if they appear raw in the database correctly?
Edit: I checked and the database collation is set to utf8mb4_unicode_ci
Thanks!
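The Content-Type header was set to charset=ISO-8859-1 instead of UTF-8, so the browser was decoding the UTF-8 bytes from the database as ISO-8859-1. I set the header explicitly: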
header("Content-Type: text/html; charset=utf-8");
This resolved the issue.
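For anyone who wants to see why the wrong charset produces that kind of garbage, here is a minimal sketch, not the exact code from my site. It assumes the mbstring extension is loaded, and `misread_as_latin1` is a hypothetical helper name I made up for illustration. It re-decodes UTF-8 bytes as ISO-8859-1, which is roughly what the browser does when the header declares the wrong charset. Browsers actually treat ISO-8859-1 as windows-1252, so the exact garbage for emojis can differ from what this prints.

```php
<?php
// Hypothetical helper (not part of any library): reinterprets the bytes of a
// UTF-8 string as ISO-8859-1 and converts the result back to UTF-8, so the
// garbled form can be printed on a UTF-8 page or terminal.
function misread_as_latin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

// The header has to be sent before any output, otherwise PHP cannot change it.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

$e_acute = "\xC3\xA9"; // "é" encoded as UTF-8 (two bytes)

echo $e_acute, "\n";                    // prints "é" when decoded as UTF-8
echo misread_as_latin1($e_acute), "\n"; // prints "Ã©", the two bytes read as two characters
```

If the page shows sequences like the second line, the stored data is most likely fine and the declared charset of the response is what needs checking.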