swift unicode nscharacterset

How to know which CharacterSet contains a given character?


Is there a way to check if a character belongs to a CharacterSet?

I want to know which CharacterSet I should use for the character -. Do I use symbols?

I've checked this documentation but still have no idea: https://developer.apple.com/documentation/foundation/characterset

When removing extra whitespace at the end of a string, we do it like this:

let someString = " "
print("\(11111) - \(someString)".trimmingCharacters(in: .whitespaces))

But what if I just want to remove the -? Or any special character such as *?

EDIT: I was looking for a complete list of the characters in each CharacterSet, if possible.


Solution

  • What you want is defined in the Unicode standard, where it is called the General Category. Every Unicode character is assigned to exactly one category.

    The Unicode website provides a complete character list showing each character's code, category, and name. It also provides a complete list of the Unicode categories.

    The - is U+002D (HYPHEN-MINUS). It is listed as being in the "Pd" (Punctuation, Dash) category.

    If you look at the documentation for CharacterSet, you will see punctuationCharacters which is documented as:

    Returns a character set containing the characters in Unicode General Category P*.

    The "Pd" category is included in "P*" (which means any "P" category).

    I also found https://www.compart.com/en/unicode/category which is a third-party list of each character by category. It is a bit more user-friendly than the Unicode reference.

    To summarize: if you want to know which CharacterSet to use for a given character, look up the character's category using one of the charts I linked. Once you know its category, look at the documentation for CharacterSet to see which predefined character set applies to that category.
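
The lookup described above can also be done in code. The following is a minimal sketch, not a definitive approach: it assumes Swift 5 or later with Foundation, and the names `namedSets` and `predefinedSets(containing:)` are hypothetical helpers invented for illustration. The list of sets it checks is a small sample, not every predefined CharacterSet.

```swift
import Foundation

// CharacterSet works on Unicode scalars, not on Character values.
let hyphen: Unicode.Scalar = "-"

// 1. Read the scalar's General Category directly (.dashPunctuation is "Pd").
let category = hyphen.properties.generalCategory
print(category == .dashPunctuation)                          // true

// 2. Ask a predefined set whether it contains the scalar.
print(CharacterSet.punctuationCharacters.contains(hyphen))   // true

// 3. Hypothetical helper: report which of a few predefined sets contain a scalar.
//    The list below is a sample chosen for illustration only.
let namedSets: [(name: String, set: CharacterSet)] = [
    ("alphanumerics", .alphanumerics),
    ("decimalDigits", .decimalDigits),
    ("letters", .letters),
    ("punctuationCharacters", .punctuationCharacters),
    ("symbols", .symbols),
    ("whitespaces", .whitespaces),
]

func predefinedSets(containing scalar: Unicode.Scalar) -> [String] {
    return namedSets.filter { $0.set.contains(scalar) }.map { $0.name }
}

print(predefinedSets(containing: hyphen))

// 4. To trim only specific characters, build a custom set instead of
//    searching for a predefined one.
let someString = " "
let custom = CharacterSet(charactersIn: "-*").union(.whitespaces)
let trimmed = "\(11111) - \(someString)".trimmingCharacters(in: custom)
print(trimmed)                                               // 11111

// trimmingCharacters(in:) only touches both ends of the string. To remove
// the characters everywhere, split on the set and rejoin the pieces.
let stripped = "a-b*c"
    .components(separatedBy: CharacterSet(charactersIn: "-*"))
    .joined()
print(stripped)                                              // abc
```

For the question's case, a custom set built with `CharacterSet(charactersIn:)` is probably the simpler choice: `.punctuationCharacters` would also trim every other character in the P* categories, which may be more than intended.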