
Aggregating and calculating in OrderedDict in Python


I have a list of OrderedDicts, "d", that looks like this:

[OrderedDict([
              ('id', '1'),
              ('date', '20170101'),
              ('quantity', '10')]),
 OrderedDict([
              ('id', '2'),
              ('date', '20170102'),
              ('quantity', '3')]),
 OrderedDict([
              ('id', '3'),
              ('date', '20170102'),
              ('quantity', '1')])]

I'm trying to group by 'date', calculate the sum of 'quantity' for each date, and display the two columns 'date' and 'sum_quantity'. How can I do that without using pandas groupby?

Thanks!


Solution

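  • Here is a pure Python approach using itertools.groupby. It is only an example to point you in the right direction. Note that groupby only groups adjacent items, so the records must be ordered by 'date' (the sample data already is).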

    from collections import OrderedDict
    import itertools
    data=[OrderedDict([
                  ('id', '1'),
                  ('date', '20170101'),
                  ('quantity', '10')]),
     OrderedDict([
                  ('id', '2'),
                  ('date', '20170102'),
                  ('quantity', '3')]),
     OrderedDict([
                  ('id', '3'),
                  ('date', '20170102'),
                  ('quantity', '1')])]

    def get_quantity(records):
        # groupby only merges adjacent items, so sort by date first
        records = sorted(records, key=lambda r: r['date'])
        result = []
        for date, group in itertools.groupby(records, key=lambda r: r['date']):
            # convert every quantity to int so each sum is numeric,
            # including dates that appear only once
            total = sum(int(r['quantity']) for r in group)
            result.append({'date': date, 'sum_quantity': total})
        return result

    print(get_quantity(data))

    output:

    [{'date': '20170101', 'sum_quantity': 10}, {'date': '20170102', 'sum_quantity': 4}]
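
  • If you would rather not depend on the records being grouped or sorted, another option is to accumulate the totals in a dictionary keyed by date. The following is only a minimal sketch: the function name sum_by_date is made up for illustration, and it assumes every 'quantity' value is a string that int() can parse.

```python
from collections import OrderedDict

# Same sample records as in the question.
data = [
    OrderedDict([('id', '1'), ('date', '20170101'), ('quantity', '10')]),
    OrderedDict([('id', '2'), ('date', '20170102'), ('quantity', '3')]),
    OrderedDict([('id', '3'), ('date', '20170102'), ('quantity', '1')]),
]


def sum_by_date(records):
    """Sum 'quantity' per 'date' (hypothetical helper, not from the original answer)."""
    totals = OrderedDict()  # keeps dates in first-seen order
    for rec in records:
        # Assumption: 'quantity' is always an integer stored as a string.
        totals[rec['date']] = totals.get(rec['date'], 0) + int(rec['quantity'])
    # Reshape into the two requested columns.
    return [{'date': d, 'sum_quantity': q} for d, q in totals.items()]


print(sum_by_date(data))
# [{'date': '20170101', 'sum_quantity': 10}, {'date': '20170102', 'sum_quantity': 4}]
```

    Because the totals are looked up by key, this works in a single pass even when records for the same date are not next to each other, and the result keeps the dates in the order they first appear.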