python list sum itertools-groupby

Sum categories by unique values in a list in Python


I have this list:

[('2023-03-15', 'paris', 'flight', 4),
('2023-03-21', 'berlin', 'flight', 2),
('2023-03-01', 'madrid', 'drive', 10),
('2023-03-04', 'madrid', 'cycling', 3),
('2023-03-08', 'rome', 'train', 9),
('2023-03-11', 'amsterdam', 'flight', 5),
('2023-03-14', 'london', 'boat', 1)]

How can I produce a new list in the same format that sums the integers of entries sharing an activity (such as "flight") and uses the latest date among those entries as the date for each activity's total?


Solution

  • Code:

    lis = [('2023-03-15', 'paris', 'flight', 4),
           ('2023-03-21', 'berlin', 'flight', 2),
           ('2023-03-01', 'madrid', 'drive', 10),
           ('2023-03-04', 'madrid', 'cycling', 3),
           ('2023-03-08', 'rome', 'train', 9),
           ('2023-03-11', 'amsterdam', 'flight', 5),
           ('2023-03-14', 'london', 'boat', 1)]

    # Accumulate the total and the latest date for each activity.
    # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
    totals = {}
    for dt, loc, activ, num in lis:
        if activ in totals:
            totals[activ]['total'] += num
            totals[activ]['latest_date'] = max(totals[activ]['latest_date'], dt)
        else:
            totals[activ] = {'total': num, 'latest_date': dt}

    # Rebuild the list: each original row keeps its location but takes
    # its activity's total and latest date.
    res = [(totals[activ]['latest_date'], loc, activ, totals[activ]['total'])
           for dt, loc, activ, num in lis]
    
    print(res)
    

    Output (one row per original entry, each carrying its activity's total and latest date):

    [('2023-03-21', 'paris', 'flight', 11), 
     ('2023-03-21', 'berlin', 'flight', 11), 
     ('2023-03-01', 'madrid', 'drive', 10), 
     ('2023-03-04', 'madrid', 'cycling', 3), 
     ('2023-03-08', 'rome', 'train', 9), 
     ('2023-03-21', 'amsterdam', 'flight', 11), 
     ('2023-03-14', 'london', 'boat', 1)]
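
    If the goal is one row per activity rather than one row per original entry, a minimal sketch using itertools.groupby could look like the following. It assumes the location to keep is the one from the latest-dated entry of each activity, which the question does not specify. The function name collapse_by_activity is made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

lis = [('2023-03-15', 'paris', 'flight', 4),
       ('2023-03-21', 'berlin', 'flight', 2),
       ('2023-03-01', 'madrid', 'drive', 10),
       ('2023-03-04', 'madrid', 'cycling', 3),
       ('2023-03-08', 'rome', 'train', 9),
       ('2023-03-11', 'amsterdam', 'flight', 5),
       ('2023-03-14', 'london', 'boat', 1)]


def collapse_by_activity(rows):
    """Return one (latest_date, location, activity, total) tuple per activity.

    Hypothetical helper: the location is taken from the latest-dated row,
    which is an assumption and not something the question states.
    """
    # groupby only merges adjacent items, so sort by activity first.
    ordered = sorted(rows, key=itemgetter(2))
    result = []
    for activ, group in groupby(ordered, key=itemgetter(2)):
        group = list(group)
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        latest = max(group, key=itemgetter(0))
        total = sum(row[3] for row in group)
        result.append((latest[0], latest[1], activ, total))
    return result


collapsed = collapse_by_activity(lis)
print(collapsed)
```

    With the list above, this yields five rows sorted by activity name, and the three "flight" entries collapse into ('2023-03-21', 'berlin', 'flight', 11).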