ruby, string, character-encoding, ruby-2.6

How to split a string with accented characters in Ruby


Currently I get:

"mɑ̃ʒe".split('')
# => ["m", "ɑ", "̃", "ʒ", "e"]

I would like to get this result instead:

"mɑ̃ʒe".split('')
# => ["m", "ã", "ʒ", "e"]

Solution

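  • Use String#each_grapheme_cluster (available since Ruby 2.5) instead. split('') breaks the string into individual code points, so the combining tilde (U+0303) becomes its own element; a grapheme cluster keeps a base character together with its combining marks. For example: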

    "mɑ̃ʒe".each_grapheme_cluster.to_a
    #=> ["m", "ɑ̃", "ʒ", "e"]
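To make the difference between code points and grapheme clusters easier to see, here is a minimal sketch that spells the word out with explicit escapes. The helper name split_graphemes and the constant WORD are only illustrative (they are not part of Ruby's API); String#grapheme_clusters and the \X regexp escape are standard Ruby, and both should be available from Ruby 2.5 on.

```ruby
# Hypothetical helper for illustration: a thin wrapper around
# String#grapheme_clusters, the array-returning sibling of each_grapheme_cluster.
def split_graphemes(str)
  str.grapheme_clusters
end

# "mɑ̃ʒe" written with escapes so the combining mark is visible:
# U+0251 (ɑ) is followed by U+0303 (combining tilde).
WORD = "m\u0251\u0303\u0292e"

p WORD.chars.length                         # code points       => 5
p split_graphemes(WORD).length              # grapheme clusters => 4
p split_graphemes(WORD)[1].codepoints       # => [593, 771], i.e. U+0251 + U+0303

# \X matches one extended grapheme cluster, so scan gives the same split.
p WORD.scan(/\X/) == split_graphemes(WORD)  # => true
```

If a regexp-based split fits the surrounding code better, scan(/\X/) should give the same result as grapheme_clusters for this input.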