
How to determine the display count of a Swift String?


I've reviewed questions such as "Get the length of a String" and "Why are emoji characters like 👩‍👩‍👧‍👦 treated so strangely in Swift strings?", but neither covers this specific question.

This all started when trying to apply skin tone modifiers to Emoji characters (see Add skin tone modifier to an emoji programmatically). This led to wondering what happens when you apply a skin tone modifier to a regular character such as "A".

Examples:

let tonedThumbsUp = "👍" + "🏻" // 👍🏻
let tonedA = "A" + "🏾" // A🏾

I'm trying to detect that second case. Both strings have a count of 1, and unicodeScalars.count is 2 for both.

How do I determine if the resulting string appears as a single character when displayed? In other words, how can I determine if the skin tone modifier was applied to make a single character or not?

I've tried a few ways to dump information about the string but none give the desired result.

func dumpString(_ str: String) {
    print("Raw:", str, str.count)
    print("Scalars:", str.unicodeScalars, str.unicodeScalars.count)
    print("UTF16:", str.utf16, str.utf16.count)
    // Count UTF-8 code units here, not UTF-16 code units.
    print("UTF8:", str.utf8, str.utf8.count)
    print("Range:", str.startIndex, str.endIndex)
    print("First/Last:", str.first == str.last, str.first, str.last)
}

dumpString("A🏽")
dumpString("\u{1f469}\u{1f3fe}")

Results (the Range lines are omitted):

Raw: A🏽 1
Scalars: A🏽 2
UTF16: A🏽 3
UTF8: A🏽 5
First/Last: true Optional("A🏽") Optional("A🏽")
Raw: 👩🏾 1
Scalars: 👩🏾 2
UTF16: 👩🏾 4
UTF8: 👩🏾 8
First/Last: true Optional("👩🏾") Optional("👩🏾")

Solution

  • What happens if you print 👍🏻 on a system that doesn't support the Fitzpatrick modifiers? You get 👍 followed by whatever the system uses for an unknown character placeholder.

    So I think to answer this, you must consult your system's typesetter. For Apple platforms, you can use Core Text to create a CTLine and then count the line's glyph runs. Example:

    import Foundation
    import CoreText
    
    func test(_ string: String) {
        let richText = NSAttributedString(string: string)
        let line = CTLineCreateWithAttributedString(richText as CFAttributedString)
        let runs = CTLineGetGlyphRuns(line) as! [CTRun]
        print(string, runs.count)
    }
    
    test("👍" + "🏻")
    test("A" + "🏾")
    test("B\u{0300}\u{0301}\u{0302}" + "🏾")
    

    Output from a macOS playground in Xcode 10.2.1 on macOS 10.14.6 Beta (18G48f):

    👍🏻 1
    A🏾 2
    B̀́̂🏾 2
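
If you only need to know whether Unicode itself considers the modifier applicable, rather than whether the current font draws a single glyph, a rough complementary check is possible with the standard library alone. The sketch below assumes Swift 5 or later, where Unicode.Scalar.Properties exposes isEmojiModifier and isEmojiModifierBase. The function name emojiModifiersFollowBases is made up for this example. It does not consult the typesetter, so it says nothing about what a particular system actually renders.

```swift
/// Returns true when every emoji modifier scalar (the Fitzpatrick skin tones)
/// in `string` directly follows a scalar that Unicode marks as
/// Emoji_Modifier_Base. Strings without any modifier also return true.
/// Hypothetical helper: it checks Unicode properties only, not font support.
func emojiModifiersFollowBases(_ string: String) -> Bool {
    var previous: Unicode.Scalar? = nil
    for scalar in string.unicodeScalars {
        if scalar.properties.isEmojiModifier {
            // A modifier is only meaningful right after a modifier base.
            guard let base = previous, base.properties.isEmojiModifierBase else {
                return false
            }
        }
        previous = scalar
    }
    return true
}

// Escapes are used so the example does not depend on source-file encoding.
print(emojiModifiersFollowBases("\u{1F44D}\u{1F3FB}"))                   // true  (thumbs up + light tone)
print(emojiModifiersFollowBases("A\u{1F3FE}"))                           // false ("A" is not a modifier base)
print(emojiModifiersFollowBases("B\u{0300}\u{0301}\u{0302}\u{1F3FE}"))   // false (modifier follows a combining mark)
```

The results line up with the run counts above for these three inputs, but the two checks answer different questions: this one reports what the Unicode data says, while the Core Text approach reports what the system actually lays out.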