pythonjsonpandascsv

read nested json inside csv file using pandas?


I have a csv file that with rows that looks like this:

745198;2024-09-10 10:09:10.7;leaf-2;{"Accelerometer": {"X": 0.055297852, "Y": 0.993530273, "Z": 0.000244141}}
745199;2024-09-10 10:09:10.71;leaf-2;{"Accelerometer": {"X": 0.056274414, "Y": 0.994384766, "Z": 0.000976563}}
745200;2024-09-10 10:09:10.721;leaf-2;{"Accelerometer": {"X": 0.055786133, "Y": 0.994018555, "Z": 0.000854492}}
745201;2024-09-10 10:09:10.732;leaf-2;{"Accelerometer": {"X": 0.055053711, "Y": 0.993530273, "Z": 0.000854492}}

and I would like to read this data into a dataframe, and somehow get the 3 accelerometer data into separate columns, but I haven't been able to figure out a good way of doing this. I have searched for similar cases Split / Explode a column of dictionaries into separate columns with pandas But none of the suggested solutions seems to work.

I can create a for loop and manually extract the information I need line by line, but I'm guessing there should be a nice fast way of doing this that I just do not know of.


Solution

  • First convert the JSON string in the 'Accelerometer' column to a dictionary:

    df = pd.read_csv('file.csv', sep=';', header=None, names=['ID', 'Timestamp', 'Device', 'Accelerometer'])
    
    df['Accelerometer'] = df['Accelerometer'].apply(json.loads)
    

    Then normalize the 'Accelerometer' column to separate columns:

    accelerometer_df = pd.json_normalize(df['Accelerometer'])
    accelerometer_df.columns = ['X', 'Y', 'Z']
    

    Finally concatenate the original DataFrame with the new accelerometer columns:

    result_df = pd.concat([df.drop(columns=['Accelerometer']), accelerometer_df], axis=1)
    print(result_df)
    

    The output:

           ID                Timestamp  Device         X         Y         Z
    0  745198    2024-09-10 10:09:10.7  leaf-2  0.055298  0.993530  0.000244
    1  745199   2024-09-10 10:09:10.71  leaf-2  0.056274  0.994385  0.000977
    2  745200  2024-09-10 10:09:10.721  leaf-2  0.055786  0.994019  0.000854
    3  745201  2024-09-10 10:09:10.732  leaf-2  0.055054  0.993530  0.000854