I have a string:
x = '1a\u0398\t\u03B43s'
How can I count the length of the literal as it is written in the source, using only code?
Adding the r prefix to the string by hand
(x = r'1a\u0398\t\u03B43s')
is not an option.
I tried the following, but it does not help either (it counts 9 symbols where there should be 18):
x = '1a\\u0398\\t\\u03B43s'
decoded_s = x.encode().decode('unicode_escape')
print(f'Symbols: {len(decoded_s)}')
returns 9
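You can't turn an existing string back into a raw string, but you can re-escape it with the unicode_escape codec and convert the resulting bytes to a string.
With x = '1a\u0398\t\u03B43s' from the question, what you want to count is what sits between the single quotes: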
>>> x.encode("unicode_escape")
b'1a\\u0398\\t\\u03b43s'
Decoding it back with unicode_escape just round-trips to the original string, which is not what you're after:
>>> x.encode("unicode_escape").decode("unicode_escape")
'1aΘ\tδ3s'
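Instead, decode the bytes as ASCII so the escape sequences stay as literal characters:
>>> x.encode("unicode_escape").decode('ascii')
'1a\\u0398\\t\\u03b43s'
>>> len(x.encode("unicode_escape").decode('ascii'))
18
Each doubled backslash in that output is just how repr displays one real backslash. The codec also writes hex digits in lowercase (\u03B4 comes back as \u03b4), which does not change the length.
With backslashes that are already in the string it's a bit more complicated. If x holds the escaped text from your attempt (x = '1a\\u0398\\t\\u03B43s'), unicode_escape escapes every backslash again. The repr then shows 4 backslashes per escape sequence, which are 2 real characters, so a simple len(...) returns 21. Therefore you want to subtract 1 for each doubled backslash (count finds each pair once):
>>> x = '1a\\u0398\\t\\u03B43s'
>>> y = x.encode("unicode_escape").decode('ascii')
>>> y
'1a\\\\u0398\\\\t\\\\u03B43s'
>>> len(y)
21
>>> len(y) - y.count("\\\\")
18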
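If it helps, the re-escaping approach can be wrapped in a small helper. This is only a sketch: escaped_length is a name made up for illustration, and it measures the spelling that the unicode_escape codec chooses, which is not guaranteed to match what was typed in the source. It happens to match for the string in the question.

```python
def escaped_length(s: str) -> int:
    """Length of s after re-escaping it with the unicode_escape codec.

    Hypothetical helper, not part of the standard library.
    """
    # unicode_escape always produces pure ASCII bytes, so decoding as ASCII is safe
    return len(s.encode("unicode_escape").decode("ascii"))


x = '1a\u0398\t\u03B43s'
print(len(x))             # 7: each escape sequence is one character
print(escaped_length(x))  # 18: the escape sequences counted as written

# Caveat: the codec picks its own spelling, so the source form is not always recovered.
print(escaped_length('\x41'))  # 1: '\x41' is just 'A', which is not escaped
```

Python does not keep the original source text of a literal once it has been parsed, so any approach of this kind can only reconstruct one possible spelling.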