This may be a stupid question but I will ask it anyway. I have a generator object:
>>> def gen():
...     for i in range(10):
...         yield i
...
>>> obj=gen()
I can measure its size:
>>> obj.__sizeof__()
24
It is said that generators get consumed:
>>> for i in obj:
...     print i
...
0
1
2
3
4
5
6
7
8
9
>>> obj.__sizeof__()
24
...but obj.__sizeof__() remains the same.
With strings it works as I expected:
>>> 'longstring'.__sizeof__()
34
>>> 'str'.__sizeof__()
27
I would be thankful if someone could enlighten me.
__sizeof__() does not do what you think it does. The method returns the internal size in bytes for the given object, not the number of items a generator is going to return.
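To see this, compare two generators that would yield very different numbers of items; a minimal sketch (the small()/large() names are just for illustration, and exact byte counts vary by Python version and platform, but on CPython both generator objects have the same footprint):
>>> import sys
>>> def small():
...     for i in range(10):
...         yield i
...
>>> def large():
...     for i in range(10 ** 6):
...         yield i
...
>>> # Same internal structure, so the same size in memory,
>>> # regardless of how many items each would eventually yield.
>>> sys.getsizeof(small()) == sys.getsizeof(large())
True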
Python cannot know beforehand how many items a generator will produce. Take, for example, the following endless generator (just an illustration; there are better ways to create a counter):
def count():
    count = 0
    while True:
        yield count
        count += 1
That generator is endless; there is no size assignable to it. Yet the generator object itself takes memory:
>>> count().__sizeof__()
168
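Even though it is endless, it can still be consumed lazily, one item at a time; for instance, itertools.islice pulls just the first few values (a quick illustration using the count() defined above):
>>> from itertools import islice
>>> list(islice(count(), 5))
[0, 1, 2, 3, 4]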
You don't normally call __sizeof__() directly; you leave that to the sys.getsizeof() function, which also adds garbage collector overhead.
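A quick way to check the relationship between the two, using the gen() from the question (the exact byte counts vary by Python version and platform, but getsizeof() is never smaller than __sizeof__()):
>>> import sys
>>> obj = gen()
>>> # getsizeof() reports __sizeof__() plus GC overhead for tracked objects
>>> sys.getsizeof(obj) >= obj.__sizeof__()
True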
If you know a generator is going to be finite and you need to know how many items it yields, use:
sum(1 for item in generator)
but note that this exhausts the generator. If you also need the values it produces, use list(generator) and then take the length of the resulting list.
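For example, with the gen() from the question, both approaches give the same count (list() stores all ten values in memory, while sum() does not):
>>> sum(1 for item in gen())
10
>>> len(list(gen()))
10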