When I run these methods
s.isdigit()
s.isnumeric()
s.isdecimal()
I always get as output either all True
or all False
for each value of s
(which is a string).
What's the difference between the three? Can you provide an example that gives two True
s and one False
(or vice versa)?
It's mostly about unicode classifications. Here's some examples to show discrepancies:
>>> def spam(s):
... for attr in 'isnumeric', 'isdecimal', 'isdigit':
... print(attr, getattr(s, attr)())
...
>>> spam('½')
isnumeric True
isdecimal False
isdigit False
>>> spam('³')
isnumeric True
isdecimal False
isdigit True
Specific behaviour is in the official docs here.
Script to find all of them:
import sys
import unicodedata
from collections import defaultdict
d = defaultdict(list)
for i in range(sys.maxunicode + 1):
s = chr(i)
t = s.isnumeric(), s.isdecimal(), s.isdigit()
if len(set(t)) == 2:
try:
name = unicodedata.name(s)
except ValueError:
name = f'codepoint{i}'
print(s, name)
d[t].append(s)