How to extract multiple JSON objects from one file?


I am very new to JSON files. Suppose I have a file containing multiple JSON objects, such as the following:

{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
 "Code":[{"event1":"A","result":"1"},…]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
 "Code":[{"event1":"B","result":"1"},…]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
 "Code":[{"event1":"B","result":"0"},…]}
…

I want to extract all "Timestamp" and "Usefulness" values into a data frame:

    Timestamp    Usefulness
 0   20140101      Yes
 1   20140102      No
 2   20140103      No
 …

Does anyone know a general way to deal with such problems?


Solution

  • Use a JSON array, in the format:

    [
    {"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
      "Code":[{"event1":"A","result":"1"},…]},
    {"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
      "Code":[{"event1":"B","result":"1"},…]},
    {"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
      "Code":[{"event1":"B","result":"0"},…]},
    ...
    ]
    

    Then load it in your Python code:

    import json
    
    with open('file.json') as json_file:
        # json.load parses the whole file, which must hold a single JSON value
        data = json.load(json_file)
    

    Now data is a list of dictionaries, one for each object in the file.

    You can access it easily, e.g.:

    data[0]["ID"]
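
If the file cannot be rewritten as an array, the objects can also be read in their original back-to-back form. The following is a minimal sketch of one way to do that, using `json.JSONDecoder.raw_decode` from the standard library. The helper names `iter_json_objects` and `extract_fields` are made up for illustration, and the sample string is a shortened copy of the data in the question, with each "Code" list cut down to the one element shown.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values.

    Hypothetical helper, not part of the json module.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it manually.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def extract_fields(records, fields):
    """Keep only the requested keys from each record (None if a key is missing)."""
    return [{field: record.get(field) for field in fields} for record in records]


# Shortened sample: the "Code" lists are truncated to the single element shown.
sample = '''{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
 "Code":[{"event1":"A","result":"1"}]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
 "Code":[{"event1":"B","result":"1"}]}
'''

rows = extract_fields(iter_json_objects(sample), ["Timestamp", "Usefulness"])
print(rows)
```

Each entry in `rows` is a small dictionary holding only the two requested keys. If pandas is installed, passing the list to `pandas.DataFrame(rows)` should give a table with "Timestamp" and "Usefulness" columns like the one in the question. For a real file, read its contents with `open(...).read()` and pass that string to `iter_json_objects`. This loads the whole file into memory, which is an assumption that may not suit very large files.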