javascript unicode emoji

How can I split a string containing emoji into an array?


I want to take a string of emoji and do something with the individual characters.

In JavaScript, "😴😄😃⛔🎠🚓🚇".length == 13 because "⛔" has length 1 and each of the others has length 2 (they are stored as UTF-16 surrogate pairs). So a plain split("") does not work:

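var string = "😴😄😃⛔🎠🚓🚇";
var s = string.split(""); // splits on UTF-16 code units, not on characters
console.log(s); // 13 entries, most of them broken surrogate halves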
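
To make the problem concrete, here is a minimal sketch using only built-in string methods. The variable names (`emoji`, `units`, `isHighSurrogate`) are illustrative and not from the original question. It shows that the first emoji is split into two separate code units, neither of which is a valid character on its own.

```javascript
// Illustrative sketch: inspect what split("") actually produces.
const emoji = "😴😄😃⛔🎠🚓🚇";
const units = emoji.split("");

// Every emoji except "⛔" occupies two UTF-16 code units.
console.log(units.length); // 13

// Hypothetical helper: true if the one-unit string is a high (leading) surrogate.
const isHighSurrogate = (unit) => {
  const code = unit.charCodeAt(0);
  return code >= 0xd800 && code <= 0xdbff;
};

// The first array entry is only half of "😴".
console.log(isHighSurrogate(units[0])); // true

// Joining the two halves restores the original character.
console.log(units[0] + units[1] === "😴"); // true
```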


Solution

  • The Grapheme Splitter library by Orlin Georgiev is pretty amazing.

    However, it hasn't been updated in a while, and at present (Sep 2020) it only supports Unicode 10 and below.

    For an updated version of Grapheme Splitter, built in TypeScript with Unicode 13 support, have a look at: https://github.com/flmnt/graphemer

    Here is a quick example:

    import Graphemer from 'graphemer';
    
    const splitter = new Graphemer();
    
    const string = "😴😄😃⛔🎠🚓🚇";
    
    splitter.countGraphemes(string); // returns 7
    
    splitter.splitGraphemes(string); // returns an array of graphemes, one entry per emoji
    
    

    The library also works with the latest emojis.

    For example "👩🏻‍🦰".length === 7 but splitter.countGraphemes("👩🏻‍🦰") === 1.

    Full disclosure: I created the library and did the work to update it to Unicode 13. The API is identical to Grapheme Splitter and is based entirely on that work. It is simply updated to the latest version of Unicode, because the original library hasn't been updated for a couple of years and seems to be no longer maintained.
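
As a rough illustration of why the red-haired woman emoji mentioned above reports a length of 7, the sketch below breaks it into code points using only built-ins (no library needed). The emoji is written with escape sequences so the individual parts are visible. The variable names are illustrative. It also shows that iterating by code point, for example with `Array.from`, is enough for the simple seven-emoji string but still splits this one into four pieces, which is the gap a grapheme-aware splitter fills.

```javascript
// Illustrative sketch: woman + light skin tone + zero-width joiner + red hair.
const redHairedWoman = "\u{1F469}\u{1F3FB}\u{200D}\u{1F9B0}";

// .length counts UTF-16 code units: 2 + 2 + 1 + 2.
console.log(redHairedWoman.length); // 7

// Array.from iterates by code point, so surrogate pairs stay together...
const codePoints = Array.from(redHairedWoman, (ch) =>
  ch.codePointAt(0).toString(16)
);
console.log(codePoints); // [ '1f469', '1f3fb', '200d', '1f9b0' ]

// ...but the single visible emoji is still reported as four items.
console.log(codePoints.length); // 4
```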