
Python - itertools.groupby 2


I'm having trouble with itertools.groupby. Given a list of strings,

    my_list = ["AD01", "AD01AA", "AD01AB", "AD01AC", "AD01AD", "AD02", "AD02AA", "AD02AB", "AD02AC"]

From this list, I want to create a list of dictionaries, where "Legacy" holds each short name and "rphy" holds the longer names that follow it.

For example:

[
{"Legacy" : "AD01", "rphy" : ["AD01AA", "AD01AB", "AD01AC", "AD01AD"]},
{"Legacy" : "AD02", "rphy" : ["AD02AA", "AD02AB", "AD02AC"]},
]

Could you help me, please?


Solution

  • You can use itertools.groupby with a couple of next calls:

    from itertools import groupby
    
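    my_list = ["AD01", "AD01AA", "AD01AB", "AD01AC", "AD01AD", "AD02", "AD02AA", "AD02AB", "AD02AC"]
    
    # Grouping by length yields alternating runs: a short name, then its longer names.
    groups = groupby(my_list, len)
    # next(g) takes the short name; next(groups)[1] consumes the run of longer names after it.
    output = [{'Legacy': next(g), 'rphy': list(next(groups)[1])} for _, g in groups]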
    
    print(output)
    # [{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB', 'AD01AC', 'AD01AD']},
    #  {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]
    

    This is not robust to reordering of the input list: it only works when each short name is immediately followed by its own longer names.
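
To see why the order matters, it can help to look at the runs that groupby produces. The snippet below is only an illustrative sketch; the shortened lists and the names runs, shuffled and shuffled_runs are made up for this example and are not part of the original answer.

```python
from itertools import groupby

my_list = ["AD01", "AD01AA", "AD01AB", "AD02", "AD02AA"]

# groupby only merges *consecutive* items that share a key, so with
# well-ordered input the runs alternate between length 4 and length 6.
runs = [(key, list(group)) for key, group in groupby(my_list, len)]
print(runs)
# [(4, ['AD01']), (6, ['AD01AA', 'AD01AB']), (4, ['AD02']), (6, ['AD02AA'])]

# Hypothetical reordered input: the short names now form a single run,
# so pairing "one short run, then one long run" no longer lines up.
shuffled = ["AD01", "AD02", "AD01AA", "AD02AA", "AD01AB"]
shuffled_runs = [(key, list(group)) for key, group in groupby(shuffled, len)]
print(shuffled_runs)
# [(4, ['AD01', 'AD02']), (6, ['AD01AA', 'AD02AA', 'AD01AB'])]
```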

    Also, if there is a "gap" in the input, e.g., if "AD01" does not have corresponding 'rphy' entries, the pairing breaks: the extra next(groups) call can raise a StopIteration error, as you have found out, or entries can end up attached to the wrong key. In that case you can use a more conventional approach:

    my_list = ["AD01", "AD02", "AD02AA", "AD02AB", "AD02AC"]
    
    output = []
    for item in my_list:
        if len(item) == 4:
            # A short name starts a new entry.
            dct = {'Legacy': item, 'rphy': []}
            output.append(dct)
        else:
            # A longer name belongs to the most recent short name
            # (assumes the list starts with a short name).
            dct['rphy'].append(item)
    
    print(output)
    # [{'Legacy': 'AD01', 'rphy': []}, {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]
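
If the input order cannot be trusted at all, one possible alternative is to group on a shared prefix instead of on position. This is a minimal sketch under an assumption that the question does not state explicitly: that every longer name starts with the four characters of its short name. The helper name group_by_prefix and its prefix_len parameter are hypothetical.

```python
def group_by_prefix(names, prefix_len=4):
    """Group names by their first prefix_len characters (assumed to be the short name)."""
    groups = {}
    for name in names:
        prefix = name[:prefix_len]
        # Create the entry on first sight of a prefix, whether or not the
        # short name itself has appeared in the input yet.
        children = groups.setdefault(prefix, [])
        if len(name) > prefix_len:
            children.append(name)
    # Dicts keep insertion order, so entries appear in order of first occurrence.
    return [{"Legacy": key, "rphy": value} for key, value in groups.items()]


shuffled = ["AD02AB", "AD01", "AD01AA", "AD02", "AD02AA", "AD01AB", "AD03"]
result = group_by_prefix(shuffled)
print(result)
# [{'Legacy': 'AD02', 'rphy': ['AD02AB', 'AD02AA']},
#  {'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB']},
#  {'Legacy': 'AD03', 'rphy': []}]
```

Note that this sketch also creates a 'Legacy' entry for a prefix whose short name never appears in the input, which may or may not be what you want.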