python json pandas dataset

Reading Data from a JSON File in pandas with multiple objects


I am using a JSON file from ACN Data on EV charging behavior. I want to read it in Python and convert it to a pandas DataFrame. The problem is that the JSON contains multiple nested objects, and I am having difficulty flattening all of the data, including userInputs, into one table row.

JSON below:

{
  "_meta":
    {
      "end": "Sat, 01 Jan 2022 08:00:00 GMT",
      "min_kWh": null,
      "site": "caltech",
      "start": "Mon, 01 Jan 2018 08:00:00 GMT"
    },
  "_items": [
    {
      "_id": "5bc90cb9f9af8b0d7fe77cd2",
      "clusterID": "0039",
      "connectionTime": "Wed, 25 Apr 2018 11:08:04 GMT",
      "disconnectTime": "Wed, 25 Apr 2018 13:20:10 GMT",
      "doneChargingTime": "Wed, 25 Apr 2018 13:21:10 GMT",
      "kWhDelivered": 7.932,
      "sessionID": "2_39_78_362_2018-04-25 11:08:04.400812",
      "siteID": "0002",
      "spaceID": "CA-496",
      "stationID": "2-39-78-362",
      "timezone": "America/Los_Angeles",
      "userID": null,
      "userInputs": null
    },
    {
      "_id": "5ca2ad12f9af8b68e0cb5d47",
      "clusterID": "0039",
      "connectionTime": "Sat, 16 Mar 2019 14:39:41 GMT",
      "disconnectTime": "Sat, 16 Mar 2019 18:39:04 GMT",
      "doneChargingTime": "Sat, 16 Mar 2019 18:25:28 GMT",
      "kWhDelivered": 24.804,
      "sessionID": "2_39_124_22_2019-03-16 14:39:40.648349",
      "siteID": "0002",
      "spaceID": "CA-312",
      "stationID": "2-39-124-22",
      "timezone": "America/Los_Angeles",
      "userID": "000001039",
      "userInputs": [
        {
          "WhPerMile": 271,
          "kWhRequested": 65.04,
          "milesRequested": 240,
          "minutesAvailable": 203,
          "modifiedAt": "Sat, 16 Mar 2019 14:40:30 GMT",
          "paymentRequired": true,
          "requestedDeparture": "Sat, 16 Mar 2019 18:02:41 GMT",
          "userID": 1039
        }
      ]
    }
]}

I have tried reading only "_items", and that worked. But I still cannot create a row that contains all of the user input data.

Python code below:

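import json
import pandas as pd

# Load the file; the session records live under the "_items" key
with open("content.json") as f:
    data = json.load(f)

# Passing the whole dict (with "_meta") raises the "Mixing dicts with non-Series" ValueError
acn_ev_set_json = pd.DataFrame(data["_items"])
acn_ev_set_json.tail(3)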

The user inputs look like this: [screenshot not included]

Some links that helped me:

  • Dataset: https://ev.caltech.edu/dataset
  • Article on the dataset: https://ev.caltech.edu/assets/pub/ACN_Data_Analysis_and_Applications.pdf
  • Stack Overflow question: Read JSON to pandas dataframe - ValueError: Mixing dicts with non-Series may lead to ambiguous ordering


Solution

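  • Try this: drop the userInputs column, explode it so each list entry gets its own row, expand each dict into columns with apply(pd.Series), and concatenate the result back onto the remaining columns:

    pd.concat(
        [
            acn_ev_set_json.drop(['userInputs'], axis=1),
            acn_ev_set_json['userInputs'].explode().apply(pd.Series),
        ],
        axis=1,
    )

    Note that userID exists both at the session level and inside userInputs, so the result will contain two columns named userID.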
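
A variation on the same idea, as a minimal sketch rather than a definitive implementation. It assumes each userInputs value is either null or a list of dicts, as in the sample above. The items list is a trimmed copy of the question's data, and the input_ prefix and the names sessions, exploded, inputs, and flat are made up for illustration. The prefix keeps the nested userID from colliding with the session-level userID.

```python
import pandas as pd

# Two sessions trimmed from the sample in the question: one without
# userInputs (None) and one with a single userInputs entry.
items = [
    {"_id": "5bc90cb9f9af8b0d7fe77cd2", "kWhDelivered": 7.932,
     "userID": None, "userInputs": None},
    {"_id": "5ca2ad12f9af8b68e0cb5d47", "kWhDelivered": 24.804,
     "userID": "000001039",
     "userInputs": [{"WhPerMile": 271, "kWhRequested": 65.04,
                     "milesRequested": 240, "userID": 1039}]},
]

sessions = pd.DataFrame(items)

# One row per userInputs entry; sessions without inputs keep a single row.
# Resetting the index keeps it unique even if a session has several entries.
exploded = sessions.explode("userInputs").reset_index(drop=True)

# Turn each dict into columns. Anything that is not a dict (missing inputs)
# becomes an empty dict, so the row is kept and filled with NaN.
inputs = pd.DataFrame(
    [d if isinstance(d, dict) else {} for d in exploded["userInputs"]],
    index=exploded.index,
).add_prefix("input_")  # hypothetical prefix, avoids a duplicate "userID"

flat = pd.concat([exploded.drop(columns="userInputs"), inputs], axis=1)
print(flat)
```

With the real file, items would be replaced by data["_items"]. A session with several userInputs entries would produce one row per entry, with the session-level columns repeated.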