pythonjsondataframegraphqlmonday.com

How do I normalize deeply nested JSON results that has a lot of different collections?


What I have is a result from monday.com via a graphql query. The data comes back with a lot of different collections such as a dictionary of a list of dictionaries... eg.

{
  "data": {
    "boards": [
      {
        "name": "board name",
        "groups": [
                    {"title": "group title 1"},
                    {"title": "group title 2"},
                    {"title": "group title 3"},
                    {"title": "group title 4"}
                  ]
       }
     ]
   },
  "account_id": "0000000"
}

and I want to make a pandas dataframe like

     group.board    group.title
0    'board_name'   'group title 1'
1    'board_name'   'group title 2'
2    'board_name'   'group title 3'
3    'board_name'   'group title 4'

I've tried

pd.json_normalize(json_data, 'boards')

But I keep getting KeyError: 'boards'


Solution

  • Your json dict is invalid (missing a closing brace, so I fixed it). I would do it like this. Here we are going into the dict at json_data["data"]["boards"] because that's where the data is, using "groups" as the records key, and using the field "name" as one of the metadata fields.

    import pandas as pd
    
    json_data = {
        "data": {
            "boards": [
                {
                    "name": "board name",
                    "groups": [
                        { "title": "group title 1" },
                        { "title": "group title 2" },
                        { "title": "group title 3" },
                        { "title": "group title 4" }
                    ]
                },
            ]
        },
        "account_id": "0000000"
    }
    
    pd.json_normalize(json_data["data"]["boards"], "groups", ["name"])
    

    Output:

    
        title           name
    0   group title 1   board name
    1   group title 2   board name
    2   group title 3   board name
    3   group title 4   board name