In Python 3, how do I convert an ASCII raw string (one that contains escape sequences) into a proper Unicode string?
As an example:
a = "ä" # note the umlaut
b = bytearray( a, "utf8" ) # yields: bytearray(b'\xc3\xa4')
s = r'\xc3\xa4' # note it's a raw string
In the example you can see how my source string s derives from the unicode string a, informed by b. The goal is to find a function, F, such that a == F(s). Thanks for your help!
I tried every combination of encode and decode and codecs that I could think of. Note, in particular, that the following yields False:
a == s.encode('latin-1').decode('unicode-escape')
You were so close!
s.encode('latin-1').decode('unicode-escape').encode('latin-1').decode('utf-8')
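Your attempt stops one round trip short. After decode('unicode-escape'), each \xNN escape has become the code point U+00NN, so you end up with the two-character string 'Ã¤' rather than 'ä'. Encoding that back with latin-1 maps each code point to the single byte with the same value, and those bytes are the original UTF-8. Here is a sketch of the chain wrapped in the function F from the question; the intermediate variable names are only mine, for illustration:

```python
def F(s: str) -> str:
    # Step 1: turn the ASCII text into bytes; the backslashes are still literal.
    raw = s.encode('latin-1')
    # Step 2: interpret the escape sequences; each \xNN becomes code point U+00NN.
    code_points = raw.decode('unicode-escape')
    # Step 3: map each code point U+00NN back to the single byte 0xNN.
    utf8_bytes = code_points.encode('latin-1')
    # Step 4: decode those bytes as the UTF-8 they were all along.
    return utf8_bytes.decode('utf-8')


a = "ä"
s = r'\xc3\xa4'
print(F(s) == a)  # True
```

Note that this assumes s only contains ASCII text plus escapes that describe UTF-8 bytes; if the escaped bytes are not valid UTF-8, the last step raises UnicodeDecodeError.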