I would like to print strings encoded like this one: "Cze\u00c5\u009b\u00c4\u0087"
but I have no idea how. The example string should be printed as: "Cześć".
What I have tried is:
text = "Cze\u00c5\u009b\u00c4\u0087"  # named text to avoid shadowing the built-in str
print(text)
# gives: CzeÅÄ
str_bytes = text.encode("unicode_escape")
print(str_bytes)
# gives: b'Cze\\xc5\\x9b\\xc4\\x87'
text = str_bytes.decode("utf8")
print(text)
# gives: Cze\xc5\x9b\xc4\x87
Whereas
print(b"Cze\xc5\x9b\xc4\x87".decode("utf8"))
gives "Cześć", but I don't know how to turn the string "Cze\xc5\x9b\xc4\x87" into the bytes b"Cze\xc5\x9b\xc4\x87".
I also know that the problem is the extra backslashes in the byte representation produced by encoding the original string with "unicode_escape", but I don't know how to get rid of them: str_bytes.replace(b'\\\\', b'\\') doesn't work.
Use raw_unicode_escape:
text = 'Cze\u00c5\u009b\u00c4\u0087'
text_bytes = text.encode('raw_unicode_escape')
print(text_bytes.decode('utf8')) # outputs Cześć
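This works because every character in the example string has a code point below 256, and raw_unicode_escape writes each such character as the single byte with that value, which recovers the original UTF-8 bytes. As a minimal sketch of the same idea, the latin-1 codec performs the same character-to-byte mapping for these code points. The helper name fix_mojibake is hypothetical, and the sketch assumes the input really is UTF-8 text that was mis-decoded as Latin-1:

```python
def fix_mojibake(text: str) -> str:
    """Reverse UTF-8 text that was mistakenly decoded as Latin-1.

    Hypothetical helper; assumes every character in `text` is below U+0100.
    """
    # latin-1 maps each code point 0-255 to the byte with the same value,
    # so this recovers the original UTF-8 byte sequence.
    raw = text.encode("latin-1")
    # Interpret those bytes as the UTF-8 they were meant to be.
    return raw.decode("utf-8")


print(fix_mojibake("Cze\u00c5\u009b\u00c4\u0087"))  # Cześć
```

One difference to be aware of: if the string contains a character above U+00FF, latin-1 raises UnicodeEncodeError, while raw_unicode_escape silently writes it as a literal \uXXXX escape sequence. Which behaviour is preferable depends on whether you want bad input to fail loudly.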