javascript unicode emoji

How can I split a string containing emoji into an array?


I want to take a string of emoji and do something with the individual characters.

In JavaScript, "😴😄😃⛔🎠🚓🚇".length == 13 because "⛔" has length 1 and each of the others has length 2 (a surrogate pair). So neither of these works:

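const string = "👨‍👨‍👧‍👧 👦🏾 😴 😄 😃 ⛔ 🎠 🚓 🚇";

// split("") cuts at every UTF-16 code unit, so surrogate pairs are torn in half
const s = string.split("");
console.log(s);

// Array.from iterates by code point: "😴" survives, but ZWJ sequences and skin-tone modifiers are still split
const a = Array.from(string);
console.log(a);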
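To make the length arithmetic concrete, here is a minimal sketch that counts UTF-16 code units and code points for three of the emoji. The helper names `codeUnits` and `codePoints` are made up for illustration, and the emoji are written as escapes so the snippet does not depend on file encoding.

```javascript
// Emoji written as escapes so the source file's encoding does not matter.
const noEntry = "\u26D4";   // "no entry" sign, inside the Basic Multilingual Plane
const sleepy = "\u{1F634}"; // sleeping face, needs a surrogate pair
// Family: man + ZWJ + man + ZWJ + girl + ZWJ + girl
const family = "\u{1F468}\u200D\u{1F468}\u200D\u{1F467}\u200D\u{1F467}";

// Illustrative helpers (not part of any library).
const codeUnits = (s) => s.split("").length;    // same as s.length
const codePoints = (s) => Array.from(s).length; // string iterator walks code points

console.log(codeUnits(noEntry), codePoints(noEntry)); // 1 1
console.log(codeUnits(sleepy), codePoints(sleepy));   // 2 1
console.log(codeUnits(family), codePoints(family));   // 11 7
```

Neither count is 1 for the family emoji, which is why both approaches above break it apart.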


Solution

  • With Intl.Segmenter, you can do this:

    const splitEmoji = (string) => [...new Intl.Segmenter().segment(string)].map(x => x.segment)
    
    splitEmoji("😴😄😃⛔🎠🚓🚇") // ['😴', '😄', '😃', '⛔', '🎠', '🚓', '🚇']
    

    This also solves the problem with "👨‍👨‍👧‍👧" and "👦🏾".

    splitEmoji("👨‍👨‍👧‍👧👦🏾") // ['👨‍👨‍👧‍👧', '👦🏾']
    

    According to CanIUse, this is supported by all modern browsers.

    If you need to support older browsers, Graphemer (mentioned in Matt Davies' answer) is the best solution:

    let Graphemer = await import("https://cdn.jsdelivr.net/npm/graphemer@1.4.0/+esm").then(m => m.default.default);
    let splitter = new Graphemer();
    let graphemes = splitter.splitGraphemes("👨‍👨‍👧‍👧👦🏾"); // ['👨‍👨‍👧‍👧', '👦🏾']
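The two options can be combined with feature detection. The following is only a sketch: `makeGraphemeSplitter` is a hypothetical helper name, and its default fallback (splitting by code point with Array.from) is an assumption chosen so the snippet runs without dependencies. That fallback will still break ZWJ sequences and skin-tone modifiers.

```javascript
// Hypothetical helper: prefer the built-in segmenter, otherwise use a caller-supplied fallback.
function makeGraphemeSplitter(fallback = (s) => Array.from(s)) {
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    // One segmenter can be reused across calls; "grapheme" is the default granularity.
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    return (s) => Array.from(segmenter.segment(s), (x) => x.segment);
  }
  // Assumption: the fallback takes a string and returns an array of strings.
  return fallback;
}

const splitGraphemes = makeGraphemeSplitter();

// Family emoji (ZWJ sequence) followed by boy + skin-tone modifier, written as escapes.
const sample = "\u{1F468}\u200D\u{1F468}\u200D\u{1F467}\u200D\u{1F467}\u{1F466}\u{1F3FE}";
console.log(splitGraphemes(sample).length); // 2 where Intl.Segmenter is available
```

If Graphemer is loaded as shown above, passing `(s) => splitter.splitGraphemes(s)` as the fallback should give grapheme-aware results on older browsers too.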