
Decode an ASCII string with Python 2.7.10


I'm fairly new to Python so I'm probably still making a lot of rookie mistakes.

I was comparing two seemingly matching strings in Python, but the comparison always returned False. When I checked the representation of each object, I found that one of the strings was encoded in ASCII.

The representation of the first string returns:

'\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'

While the representation of the second string returns:

"itinerary_options_search_button" = "Launch the search";

I'm trying to figure out how to decode the first string to get the second string, so that my comparison of the two will match. When I decode the first string with

string.decode('ascii')

I get a unicode object. I'm not sure what to do to get the decoded string.


Solution

  • Your first string seems to have some issues. I'm not entirely sure why there are so many null characters (\x00), but either way, we could write a function to clean those up:

    s_1 = '\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'
    s_2 = '"itinerary_options_search_button" = "Launch the search";'
    
    def null_cleaner(string):
        # Drop every null character (\x00) and keep everything else.
        return string.replace("\x00", "")
    
    print(null_cleaner(s_1) == null_cleaner(s_2))
    

    A slightly less robust way of doing this is to simply slice the string, keeping every other character starting at index 1, which skips the \x00 characters at the even positions:

    s_1 = '\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'
    s_2 = '"itinerary_options_search_button" = "Launch the search";'
    
    print(s_1[1::2] == s_2)
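
A null byte next to every character is what ASCII-range text looks like when it is encoded as UTF-16, so another option may be to decode the bytes instead of stripping them. The sketch below assumes that the data really is little-endian UTF-16 and that the leading \x00 is a stray byte (for example, left over from splitting UTF-16 data on a newline byte, though that is only a guess). The names raw and expected are illustrative, and the b'' and u'' prefixes are only there so the same code runs on Python 2.7 and Python 3.

```python
# Assumption: the bytes are UTF-16-LE text preceded by one stray null byte.
raw = b'\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'
expected = u'"itinerary_options_search_button" = "Launch the search";'

# Skip the stray first byte so the remaining length is even, then decode
# each (character, \x00) byte pair as one little-endian UTF-16 code unit.
decoded = raw[1:].decode('utf-16-le')

print(decoded == expected)
```

Unlike stripping or slicing, decoding would also keep non-ASCII characters intact if the text ever contains them. If the string was read from a file, opening that file with the matching encoding (for example via io.open with its encoding argument) might avoid the stray bytes in the first place.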