python unicode isalpha

Why is "ǃ".isalpha() True but "!".isalpha() False?


I have just come across this strange behaviour while parsing data from IANA:

"ǃ".isalpha() # returns True
"!".isalpha() # returns False

Apparently, the two exclamation marks are different:

In [62]: hex(ord("ǃ"))
Out[62]: '0x1c3'

In [63]: hex(ord("!"))
Out[63]: '0x21'

Is there a way to prevent this from happening? And where does this behaviour come from?


Solution

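  • Check the characters in the Unicode Character Database. The exclamation-like ǃ (U+01C3, written "\u01c3" in Python) is a letter: its general category is Lo ("Letter, other"), and str.isalpha() returns True for any character whose category is one of Lu, Ll, Lt, Lm or Lo. The ordinary ! (U+0021) is punctuation (Po), so isalpha() returns False for it: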

    import unicodedata

    # Print each character with its code point, general category and official name
    for c in "!ǃ":
        print(c, f'{ord(c):04x}', unicodedata.category(c), unicodedata.name(c))

    ! 0021 Po EXCLAMATION MARK
    ǃ 01c3 Lo LATIN LETTER RETROFLEX CLICK
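
As for preventing it: one possible approach, assuming the data is only supposed to contain the ASCII letters A-Z and a-z, is to combine isalpha() with str.isascii() (available since Python 3.7). The helper name is_ascii_alpha below is made up for illustration, and this is only a sketch: it also rejects legitimate non-ASCII letters such as é, which may or may not be what you want.

```python
def is_ascii_alpha(s: str) -> bool:
    """Return True only if s is non-empty and consists solely of A-Z / a-z.

    Hypothetical helper: str.isascii() rules out every code point above
    U+007F, so letters such as U+01C3 (LATIN LETTER RETROFLEX CLICK) are
    rejected even though str.isalpha() accepts them.
    """
    return s.isascii() and s.isalpha()


if __name__ == "__main__":
    for text in ["abc", "!", "\u01c3", "\u00e9"]:
        # Show the escaped form so look-alike characters are easy to tell apart
        print(ascii(text), text.isalpha(), is_ascii_alpha(text))
```

If non-ASCII letters must stay valid and only a few look-alike characters are a problem, checking unicodedata.name(c) against an explicit deny list is another option.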