Tags: python, memory-management, memory-size

Python: How to estimate/calculate the memory footprint of data structures?


What's a good way to estimate the memory footprint of an object?

Alternatively, what's a good way to measure the actual footprint?

For example, say I have a dictionary whose values are lists of (integer, float) tuples:

d['key'] = [ (1131, 3.11e18), (9813, 2.48e19), (4991, 9.11e18) ]

I have 4 GB of physical memory and would like to figure out approximately how many rows (key/value pairs) I can store in memory before I spill into swap. This is on Linux/Ubuntu 8.04 and OS X 10.5.6.

Also, what's the best way to measure the actual in-memory footprint of my program? How do I best tell when it's exhausting physical memory and spilling into swap?
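
For a back-of-the-envelope estimate, one possible sketch (it relies on sys.getsizeof, available in Python 2.6+, and gives only a lower bound, since it ignores allocator overhead and counts shared objects once):

    import sys

    def total_size(obj, seen=None):
        # Recursively sum sys.getsizeof over an object and its contents.
        # Lower bound only: ignores allocator overhead; the `seen` set
        # ensures shared objects are counted once.
        if seen is None:
            seen = set()
        if id(obj) in seen:
            return 0
        seen.add(id(obj))
        size = sys.getsizeof(obj)
        if isinstance(obj, dict):
            size += sum(total_size(k, seen) + total_size(v, seen)
                        for k, v in obj.items())
        elif isinstance(obj, (list, tuple, set, frozenset)):
            size += sum(total_size(item, seen) for item in obj)
        return size

    row = [(1131, 3.11e18), (9813, 2.48e19), (4991, 9.11e18)]
    per_row = total_size('key') + total_size(row)
    print(per_row)                 # approximate bytes per key/value pair
    print(4 * 1024**3 // per_row)  # very rough upper bound on rows in 4 GB

In practice small integers and interned strings are shared across rows, so the real per-row cost can be lower, while dict resizing and heap fragmentation push it higher.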


Solution

  • Guppy has a nice memory profiler (Heapy):

    >>> from guppy import hpy
    >>> hp = hpy()
    >>> hp.setrelheap() # ignore all existing objects
    >>> d = {}
    >>> d['key'] = [ (1131, 3.11e18), (9813, 2.48e19), (4991, 9.11e18) ]
    >>> hp.heap()
     Partition of a set of 24 objects. Total size = 1464 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      2   8      676  46       676  46 types.FrameType
         1      6  25      220  15       896  61 str
         2      6  25      184  13      1080  74 tuple
     ...
    

    Heapy is somewhat under-documented, so you might have to dig through the web page or the source code a bit, but it's very powerful. There are also some articles around the web that might be relevant.
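
    To watch the process as a whole (the second half of the question), a minimal sketch using the standard-library resource module; note that ru_maxrss is a peak figure, reported in kilobytes on Linux but in bytes on OS X:

        import resource
        import sys

        def peak_rss_bytes():
            # Peak resident set size of this process so far.
            # ru_maxrss is kilobytes on Linux, bytes on OS X (Darwin).
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            return rss if sys.platform == 'darwin' else rss * 1024

        print('peak RSS: %.1f MB' % (peak_rss_bytes() / (1024.0 * 1024.0)))

    Once that peak approaches physical memory, further growth starts spilling into swap; on Linux you can cross-check the current figure in /proc/self/status (the VmRSS line).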