pythonpandasjson-normalize

How to groupby two columns of a dataframe and convert other columns into dict with column header as key


Dataframe:

id      id_2    salary  title   allowance   name
0420    13.28   100000  director    No      Tom
0420    13.28   70000   developer   Yes     Sam
0110    13.12   120000  director    No      Dave
0110    13.12   75000   developer   Yes     shaun 

Groupby id and id_2 and convert the rest of columns to dict with column header

I wrote a loop for this, I think don't think it is python way, Please let me know how to do it with pandas.

Required output:

[{
            "id": 420,
            "id_2": 13.28,
            "attributes":[
                    {   "salary": 100000,
                        "title":"director",
                        "allowance":"No",
                        "name": "Tom"
                    },
                    {   "salary": 70000,
                        "title": "developer",
                        "allowance":"Yes",
                        "name": "Sam"
                    }
                ]
            },
            {
            "id": 110,
            "id_2": 13.12,
            "attributes":[
                    {   "salary": 120000,
                        "title":"director",
                        "allowance":"No",
                        "name": "Dave"
                    },
                    {   "salary": 75000,
                        "title": "developer",
                        "allowance":"Yes",
                        "name": "shaun"
                    }
                ]
            }   
]

Solution

  • dict_list = list()
    groupby_list = ['id', 'id_2']
    for g, d in df.groupby(groupby_list):
        group = dict(zip(groupby_list, g))
        att_list = list()
        for _, data in d.iterrows():
            data = data.drop(labels=groupby_list)
            att_list.append(data.to_dict())
        group['attributes'] = att_list
        dict_list.append(group)
    

    dict_list:

    [{'attributes': [{'allowance': 'No',
                      'name': 'Dave',
                      'salary': 120000,
                      'title': 'director'},
                     {'allowance': 'Yes',
                      'name': 'shaun',
                      'salary': 75000,
                      'title': 'developer'}],
      'id': 110,
      'id_2': 13.12},
     {'attributes': [{'allowance': 'No',
                      'name': 'Tom',
                      'salary': 100000,
                      'title': 'director'},
                     {'allowance': 'Yes',
                      'name': 'Sam',
                      'salary': 70000,
                      'title': 'developer'}],
      'id': 420,
      'id_2': 13.28}]