python memory

asizeof appears to be inaccurate


Take this MWE:

from pympler import asizeof
from random import randint, choice
from string import printable
from heapq import heappush
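# Printable characters minus the trailing tab/newline/etc. (the space is kept).
# Named `alphabet` so the builtin `ascii` is not shadowed.
alphabet = printable[:-5]
pq = []
for _ in range(10_000_000):
    heappush(pq, (randint(0, 31), randint(0, 31), randint(0, 31), ''.join(choice(alphabet) for _ in range(16))))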
print(asizeof.asizeof(pq))

I can see from running 'top' that this takes about 2.7GB of RAM, but asizeof reports 1,449,096,184 bytes, which is only about half of that.

This is what 'top' shows:

[screenshot of 'top' output, image not available]

/usr/bin/time -v gives:

Maximum resident set size (kbytes): 2858616

Using another way of measuring RAM:

from resource import getrusage, RUSAGE_SELF
print(getrusage(RUSAGE_SELF).ru_maxrss * 1024)

This prints:

2927054848
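
One caveat about the snippet above: ru_maxrss is reported in kilobytes on Linux (hence the * 1024) but in bytes on macOS. A minimal sketch that accounts for this, assuming a Unix-like system where the resource module is available (peak_rss_bytes is just a name I made up for illustration):

```python
import sys
from resource import getrusage, RUSAGE_SELF

def peak_rss_bytes():
    """Peak resident set size of the current process, in bytes."""
    maxrss = getrusage(RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    return maxrss if sys.platform == 'darwin' else maxrss * 1024

print(f'{peak_rss_bytes():,}')
```

Like the number from /usr/bin/time, this is the peak for the whole process, not the size of any one object.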

Solution

  • asizeof does fairly accurately what it is supposed to do: measure the total size of the object structure. That is just not all the memory that the Python process uses.

    I get the exact same total, 1,449,096,184 bytes, with this minified test:

    from sys import getsizeof
    
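    def size(obj, align=8):
        # Round getsizeof up to the next multiple of align
        # (ceiling division via negated floor division).
        return getsizeof(obj) // -align * -align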
    
    a = []
    for _ in range(10_000_000):
        a.append((0, 0, 0, ' ' * 16))
    
    list_size = size(a)
    tuple_size = size(a[0])
    str_size = size(a[0][3])
    ints_size = sum(map(size, range(32)))
    
    # Computed outside the f-string so this also runs before Python 3.12.
    total = (
        list_size +
        len(a) * (tuple_size + str_size) +
        ints_size
    )
    print(f'{total:,}')
    

    Using align=16 (asizeof accepts the same option) would be more realistic, since that is likely the alignment your Python uses. With that I get 1,529,096,192 bytes.
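
To make the effect of the alignment concrete, here is a small sketch that applies the same rounding to a single element instead of ten million. The helper names round_up and element_size are mine, not part of asizeof, and the exact sizes depend on your Python version and build, so I print them rather than quote them:

```python
from sys import getsizeof

def round_up(n, align):
    """Round n up to the next multiple of align (negated floor division = ceiling)."""
    return n // -align * -align

def element_size(item, align):
    """Aligned size of one tuple plus its string.

    The three small ints are shared between all elements, so they are
    not counted per element here.
    """
    return round_up(getsizeof(item), align) + round_up(getsizeof(item[3]), align)

item = (0, 0, 0, ' ' * 16)
for align in (8, 16):
    print(align, element_size(item, align))
```

The two totals above differ by 80,000,008 bytes, which is consistent with 8 extra bytes per element for the 10,000,000 elements, plus 8 for the list itself.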