I'm fairly new to Python so I'm probably still making a lot of rookie mistakes.
I was comparing two seemingly matching strings in Python, but the comparison always returned False. When I checked the representation of each object, I found that one of the strings appeared to be encoded in ASCII.
The representation of the first string returns:
'\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'
While the representation of the second string returns:
"itinerary_options_search_button" = "Launch the search";
I'm trying to figure out how to decode the first string to get the second string, so that my comparison of the two will match. When I decode the first string with
string.decode('ascii')
I get a unicode object. I'm not sure what to do to get the decoded string.
Your first string seems to have some issues. I'm not entirely sure why there are so many null characters (\x00), but either way, we could write a function to clean those up:
s_1 = '\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'
s_2 = '"itinerary_options_search_button" = "Launch the search";'
def null_cleaner(string):
    # Build a copy of the string that skips every null character
    new_string = ""
    for char in string:
        if char != "\x00":
            new_string += char
    return new_string
print(null_cleaner(s_1) == null_cleaner(s_2))
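The loop above can probably be collapsed into a single call to the built-in str.replace. This is only a sketch of the same idea; strip_nulls is a name I made up for illustration:

```python
def strip_nulls(text):
    # Replace every null character with nothing, in a single pass.
    return text.replace("\x00", "")


# Shortened sample with the same shape as the string in the question.
sample = '\x00"\x00k\x00e\x00y\x00"\x00;\x00'
print(strip_nulls(sample) == '"key";')
```

It should behave the same as null_cleaner for plain strings, while avoiding the repeated string concatenation.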
A slightly less robust way of doing this is to simply slice the string and keep every other character, skipping the \x00 characters. This only works if the nulls sit at every even index:
s_1 = '\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'
s_2 = '"itinerary_options_search_button" = "Launch the search";'
print(s_1[1::2] == s_2)
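As for where the null characters come from: a \x00 next to every ASCII character is what UTF-16 text usually looks like when it is read as raw bytes, so decoding may be cleaner than stripping. The sketch below assumes the data is UTF-16 little-endian and that the leading \x00 is a leftover from an earlier split (for example, reading a UTF-16 file line by line as bytes). Both are guesses on my part, and the sample is a shortened stand-in, not the original string:

```python
# Shortened stand-in for the question's data: same shape, with one stray
# null byte at the front (hypothetical sample, not the original string).
raw = b'\x00"\x00k\x00e\x00y\x00"\x00 \x00=\x00 \x00"\x00v\x00"\x00;\x00'
expected = u'"key" = "v";'

# Assumption: the bytes are UTF-16 little-endian and the leading \x00 is left
# over from an earlier split, so drop it before decoding.
decoded = raw[1:].decode("utf-16-le")

print(decoded == expected)
```

If the string comes from a file, opening it with something like io.open(path, encoding="utf-16") (where path is whatever file you are reading) would likely avoid the problem at the source, since the text is then decoded before it is split into lines. Unlike stripping nulls, decoding also keeps non-ASCII characters intact.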