swift nsstring foundation core-foundation cfstring

Extra \N{...} when using kCFStringTransformToUnicodeName or NSStringTransformToUnicodeName


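import Foundation

let string = "\u{00A0}" // no-break space
// Swift 3+ spelling; in Swift 2 this was stringByApplyingTransform(NSStringTransformToUnicodeName, reverse: false)
let transformed = string.applyingTransform(.toUnicodeName, reverse: false)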

Expected result: NO-BREAK SPACE

Actual result: \N{NO-BREAK SPACE}

Why the extra \N{ and }? What are they for, and is there any way to remove them, short of regex/scanning/parsing/etc?


Solution

  • That is how ICU and Unicode represent named code points in regular expressions, so the output is not surprising.

    The syntax is documented at unicode.org.

    It is also explained in the ICU Project documentation.

    PS: \N{} is the shorter equivalent of \p{name=…}, as explained on the unicode.org page mentioned above. Similar syntax appears on regular-expressions.info, which describes the \p{…} notation for matching Unicode code points by their properties.
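    Since the wrapper is part of the transform's output format, one way to get the bare names is to strip it afterwards. The following is a minimal sketch, not an official API: `unicodeNames(in:)` is a hypothetical helper name, and it assumes the transformed string consists of `\N{...}` groups as shown in the question. It relies on Unicode character names never containing a closing brace.

```swift
import Foundation

/// Hypothetical helper (not part of Foundation): extracts the text between
/// each `\N{` and the next `}` in a string produced by the to-Unicode-name
/// transform. Anything outside such groups is ignored.
func unicodeNames(in transformed: String) -> [String] {
    var names: [String] = []
    var rest = Substring(transformed)

    // Walk the string, one `\N{...}` group at a time.
    while let open = rest.range(of: "\\N{") {
        let afterOpen = rest[open.upperBound...]
        // Stop if a group is unterminated rather than guessing its end.
        guard let close = afterOpen.firstIndex(of: "}") else { break }
        names.append(String(afterOpen[..<close]))
        rest = afterOpen[afterOpen.index(after: close)...]
    }
    return names
}

// Usage on a literal in the transform's output format:
let sample = "\\N{NO-BREAK SPACE}\\N{LATIN SMALL LETTER A}"
print(unicodeNames(in: sample))
// prints ["NO-BREAK SPACE", "LATIN SMALL LETTER A"]
```

    To apply it to real output, pass the result of `applyingTransform(.toUnicodeName, reverse: false)` (unwrapped, since it is optional) to the helper. Returning an array instead of a joined string keeps the names separable, because the names themselves contain spaces.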