python

Python: Get size of string in bytes


I have a string that will be sent over a network, and I need to know how many bytes it takes up.

sys.getsizeof(string_name) returns extra bytes. For example, sys.getsizeof("a") returns 22, even though I would expect a single character to take only 1 byte. Is there another way to find this?


Solution

  • If you want the number of bytes a string takes up once it is encoded (which is what actually goes over the network), this function will do it:

    def utf8len(s):
        return len(s.encode('utf-8'))
    

    The reason you got weird numbers is that sys.getsizeof measures the whole string object in memory, not just its characters. Strings are full objects in Python, so each one carries a fixed header on top of the character data.

    In CPython that header holds things like the reference count, a pointer to the type, the length, and the cached hash. Note that methods such as 'encode' (the one my solution calls on 's') are not part of that count: they live on the str type and are shared by every string. The exact overhead depends on the Python version and build, so it tells you nothing about how many bytes will be sent. Encode the string and measure the result instead.
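
To see the difference between the two measurements, here is a minimal sketch, assuming CPython 3 and UTF-8 as the wire encoding. The sample strings are made up for illustration, and the sys.getsizeof values are printed rather than hard-coded because they vary between versions and builds.

```python
import sys

def utf8len(s):
    """Return the number of bytes s occupies once encoded as UTF-8."""
    return len(s.encode("utf-8"))

# Illustrative samples (not from the question): ASCII, accented Latin,
# the euro sign, and an emoji outside the Basic Multilingual Plane.
samples = ["a", "h\u00e9llo", "\u20ac", "\U0001F600"]

for s in samples:
    # len(s) counts code points, utf8len(s) counts encoded bytes,
    # sys.getsizeof(s) counts the in-memory object including its header.
    print(repr(s), len(s), utf8len(s), sys.getsizeof(s))
```

The middle number is the one that matters for the network. It only equals len(s) for pure ASCII text, and it changes if you pick a different encoding, so measure with the same encoding you actually send.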