
How to iterate over Pandas Series generated from groupby().size()


How do you iterate over a Pandas Series generated from a .groupby('...').size() call and get both the group name and the count?

As an example, if I have:

foo
-1     7
 0    85
 1    14
 2     5

How can I loop over them so that in each iteration I have -1 & 7, 0 & 85, 1 & 14, and 2 & 5 in variables?

I tried the enumerate option but it doesn't quite work. Example:

for i, row in enumerate(df.groupby(['foo']).size()):
    print(i, row)

It doesn't return -1, 0, 1, and 2 for i, but rather 0, 1, 2, 3.


Solution

  • Update:

    Given a pandas Series:

    s = pd.Series([1,2,3,4], index=['a', 'b', 'c', 'd'])
    
    s
    #a    1
    #b    2
    #c    3
    #d    4
    #dtype: int64
    

    You can loop through it directly, which yields one value from the Series in each iteration:

    for i in s:
        print(i)
    #1
    #2
    #3
    #4
    

    If you want to access the index at the same time, use the items method, which lazily yields (index, value) pairs. The older iteritems method did the same thing, but it was deprecated in pandas 1.5 and removed in pandas 2.0, so it only works on earlier versions:

    for i, v in s.items():
        print('index: ', i, 'value: ', v)
    #index:  a value:  1
    #index:  b value:  2
    #index:  c value:  3
    #index:  d value:  4
    
    # pandas < 2.0 only: iteritems() was removed in pandas 2.0, use items() instead
    for i, v in s.iteritems():
        print('index: ', i, 'value: ', v)
    #index:  a value:  1
    #index:  b value:  2
    #index:  c value:  3
    #index:  d value:  4
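
To make the difference from the enumerate attempt in the question concrete, here is a minimal sketch using the same example Series. The names by_position and by_label are only illustrative and do not come from the original answer.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])

# enumerate() counts positions 0, 1, 2, ... and never looks at the index labels.
by_position = [(i, v) for i, v in enumerate(s)]

# items() pairs each index label with its value.
by_label = [(i, v) for i, v in s.items()]

print(by_position)
print(by_label)
```

This is why the loop in the question printed 0, 1, 2, 3: enumerate supplies its own counter, while the group names live in the Series index.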
    

    Old Answer:

    You can call the iteritems() method on the Series (on pandas 2.0 or later, where iteritems() has been removed, call items() instead):

    for i, row in df.groupby('a').size().iteritems():
        print(i, row)
    
    # 12 4
    # 14 2
    

    According to the documentation:

    Series.iteritems()

    Lazily iterate over (index, value) tuples

    Note: This is not the same data as in the question, just a demo.
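
For the data in the question, the same approach might look like the sketch below. The DataFrame is a hypothetical reconstruction, built only so that the group sizes match the counts shown in the question; the real df is not given. It uses items() so that it also runs on pandas 2.0 and later.

```python
import pandas as pd

# Hypothetical data: one row per observation, chosen so the group sizes
# match the question (-1 -> 7, 0 -> 85, 1 -> 14, 2 -> 5).
df = pd.DataFrame({"foo": [-1] * 7 + [0] * 85 + [1] * 14 + [2] * 5})

sizes = df.groupby("foo").size()

pairs = []
for name, count in sizes.items():
    # name is the group label from the index, count is the group size
    pairs.append((name, count))
    print(name, count)
```

Each iteration gives the group name and its count in separate variables, which is what the question asks for.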