python, python-3.x, python-itertools

Python: group values by key using the itertools groupby function


My Python code is below.

from itertools import groupby

data = [('a', 1), ('b', 2), ('b', 3), ('c', 4), ('c', 5)]
_sorted_data = sorted(data, key=lambda element: element[0])
_res = groupby(_sorted_data, key=lambda x: x[0])
for key, value in _res:
    print(key, list(value))

I got the result below.

a [('a', 1)]
b [('b', 2), ('b', 3)]
c [('c', 4), ('c', 5)]

My expected result is:

a [1]
b [2,3]
c [4,5]

How can I transform the values so that each group contains only the second element of each tuple?


Solution

  • It looks like your actual input is unsorted, since you call sorted before grouping. The sort makes the overall time complexity O(n log n), even though itertools.groupby itself runs in linear time.

    You can instead iterate over the input list and aggregate the items in a dict for an overall linear time complexity:

    data = [('a', 1), ('b', 2), ('b', 3), ('c', 4), ('c', 5)]
    result = {}
    for key, value in data:
        result.setdefault(key, []).append(value)
    for key, values in result.items():
        print(key, values)
    

    This outputs:

    a [1]
    b [2, 3]
    c [4, 5]
    

    Demo: https://ideone.com/1D7gRS
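
If you would rather keep itertools.groupby, the values can be pulled out of each group with a list comprehension. The following is only a minimal sketch: the helper name group_values is made up for illustration, and it still pays the O(n log n) cost of sorting, because groupby only merges adjacent items that share a key.

```python
from itertools import groupby
from operator import itemgetter


def group_values(pairs):
    """Group (key, value) pairs by key, keeping only the values.

    group_values is a hypothetical helper name, not part of itertools.
    """
    # groupby only merges adjacent items with equal keys, so sort by key first.
    # sorted() is stable, so values keep their original relative order.
    ordered = sorted(pairs, key=itemgetter(0))
    return {
        key: [value for _, value in group]
        for key, group in groupby(ordered, key=itemgetter(0))
    }


data = [('a', 1), ('b', 2), ('b', 3), ('c', 4), ('c', 5)]
for key, values in group_values(data).items():
    print(key, values)
```

For the sample data this should print the same three lines as the dict-based version. Unlike that version, the keys come out in sorted order rather than in order of first appearance.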