I was trying to take out all emoji chars out of a string (like a sanitizer). But I cannot find a complete set of emoji values.
What is the complete set of emoji chars' UTF16 values?
The Unicode standard's Unicode® Technical Report #51 includes a list of emoji (emoji-data.txt):
...
21A9 ; text ; L1 ; none ; j # V1.1 (↩) LEFTWARDS ARROW WITH HOOK
21AA ; text ; L1 ; none ; j # V1.1 (↪) RIGHTWARDS ARROW WITH HOOK
231A ; emoji ; L1 ; none ; j # V1.1 (⌚) WATCH
231B ; emoji ; L1 ; none ; j # V1.1 (⌛) HOURGLASS
...
I believe you would want to remove each character listed in this document which had a Default_Emoji_Style
of emoji
.
There is no way, other than reference to a definition list like this, to identify the emoji characters in Unicode. As the reference to the FAQ says, they are spread throughout different blocks.