pythonjsondataframejson-normalize

Is there a way to get the ordinality of the array in the JSON using json_normalize or anyother way of doing it in python?


Is there a way to get the ordinality of the array in the JSON using json_normalize or anyother way of doing it in python ?

I have written the below code to flatten, but I need to add the ordinality, is there a way of doing it simplistic ?

import pandas as pd
import json
json_str = '''{
    "id": "001",
    "Name": "TOM",
"tableData": {
        "subjects": [{
                "Subject": "Maths",
                "alias": "param0"
            }, {
                "Subject": "science",
                "alias": "param1"
                }],
        "marks":[[{
                    "alias": "param0",
                    "value": "85",
                    "Sem" : "first"
                }, {
                    "alias": "param1",
                    "value": "98",
                    "Sem" : "second"
                }       
              ],
              [{
                    "alias": "param0",
                    "value": "88",
                    "Sem" : "first"
                }, {
                    "alias": "param1",
                    "value": "90",
                    "Sem" : "second"
                }       
              ]]
            }  
}'''
dict1 = json.loads(json_str)
dict_subj = pd.json_normalize(dict1,record_path=['tableData','subjects'],meta=['id','Name'])
print(dict_subj)
dict_mark = pd.json_normalize(dict1,record_path=['tableData','marks',[]])
print(dict_mark)
dict_all = pd.merge(dict_subj, dict_mark, on=['alias']) 
print(dict_all) 

current output:

print(dict_mark)

alias value Sem
0 param0 85 first
1 param1 98 second
2 param0 88 first
3 param1 90 second

Needed output :

print(dict_mark)

alias value Sem Ordinality
0 param0 85 first 1
1 param1 98 second 1
2 param0 88 first 2
3 param1 90 second 2

Solution

  • Try:

    import json
    
    dict1 = json.loads(json_str)
    
    df = pd.DataFrame(dict1['tableData']['marks']).T
    df = pd.concat([df[c].apply(pd.Series).assign(Ordinality=i) for i, c in enumerate(df, 1)], ignore_index=True)
    
    print(df)
    

    Prints:

        alias value     Sem  Ordinality
    0  param0    85   first           1
    1  param1    98  second           1
    2  param0    88   first           2
    3  param1    90  second           2