I have a set of images. In each image, a program finds objects with attributes X and type. The number of objects varies from image to image. Hence for one image I have a df_objects with N_objects rows and 2 columns, X and type.
Then I build a df_images with the images as rows and the columns time and objects, where the entry for objects is the df_objects above. This works very well inside the program. Of course the point is to store the structure, so I tried DataFrame.to_csv.
Then I read it back with pd.read_csv. It seems to work: for example, using the read df_images, I can print the df_objects of image 1. But not quite: df_objects["type"] is not accepted and raises an error:
TypeError: string indices must be integers
This happens although the code is strictly identical to the code tested on the original df. See the code below. Thanks!
import pandas as pd
df1 = pd.DataFrame({"X":(1.1,1.2),"type":("a_1","b_1")})
print(' df1')
print(df1)
df2 = pd.DataFrame({"X":(2.1,2.2,2.3),"type":("a_2","b_2","c_2")})
print(' df2')
print(df2)
print(' ')
dfT = pd.DataFrame ({"time":(6,7),"dff":(df1,df2)})
df1_test = dfT["dff"][0]
print(' df1_test')
print(df1_test)
df2_test = dfT["dff"][1]
print(' df2_test')
print(df2_test)
print(' ')
type_list_evt_1 = df1_test["type"]
print(' type_list_evt_1')
print(type_list_evt_1)
print(' ')
dfT.to_csv(path_or_buf="test_dff.csv", index=False)
read_dfT = pd.read_csv('test_dff.csv')
df1_read = read_dfT["dff"][0]
print(' df1_read')
print(df1_read)
df2_read = read_dfT["dff"][1]
print(' df2_read')
print(df2_read)
print(' ')
type_list_evt_1_read = df1_read["type"]
print(' type_list_evt_1_read')
print(type_list_evt_1_read)
I would like the df read back to behave exactly like the df that was written.
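CSV stores each nested DataFrame as its printed text, so read_csv gives you back a plain string, and indexing a string with "type" raises that TypeError. If you prefer a format that is easy to inspect and edit, you can use JSON: each df_objects can be stored as a JSON string within the main DataFrame.
For example: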
import pandas as pd
from io import StringIO
df1 = pd.DataFrame({"X": [1.1, 1.2], "type": ["a_1", "b_1"]})
df2 = pd.DataFrame({"X": [2.1, 2.2, 2.3], "type": ["a_2", "b_2", "c_2"]})
# Serialize each nested DataFrame to a JSON string
df1_json = df1.to_json(orient='split')
df2_json = df2.to_json(orient='split')
df_images = pd.DataFrame({
    "time": [6, 7],
    "objects": [df1_json, df2_json]
})
# Save DataFrame as CSV
df_images.to_csv("df_images.csv", index=False)
# Read the CSV back and turn each JSON string into a DataFrame again
read_df_images = pd.read_csv("df_images.csv")
read_df_images["objects"] = read_df_images["objects"].apply(lambda x: pd.read_json(StringIO(x), orient='split'))
df1_read = read_df_images["objects"][0]
print("df1_read")
print(df1_read)
# Column access now works because the cell is a DataFrame, not a string
type_list_evt_1_read = df1_read["type"]
print("type_list_evt_1_read")
print(type_list_evt_1_read)
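If you want to see the difference between the two round trips side by side, here is a small sketch that checks what kind of object comes back in each case. It uses an in-memory buffer instead of a file, and the helper name csv_round_trip is just something I made up for the illustration, not a pandas function.

```python
import io
import pandas as pd

df1 = pd.DataFrame({"X": [1.1, 1.2], "type": ["a_1", "b_1"]})
df2 = pd.DataFrame({"X": [2.1, 2.2, 2.3], "type": ["a_2", "b_2", "c_2"]})


def csv_round_trip(df):
    """Hypothetical helper: write df to an in-memory CSV and read it back."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    buffer.seek(0)
    return pd.read_csv(buffer)


# Nested DataFrames written directly: each cell is saved as its printed text.
raw = csv_round_trip(pd.DataFrame({"time": (6, 7), "dff": (df1, df2)}))
raw_cell = raw["dff"][0]
print(type(raw_cell).__name__)  # str, hence the TypeError on raw_cell["type"]

# Nested DataFrames encoded as JSON first, then decoded after reading.
encoded = pd.DataFrame({
    "time": [6, 7],
    "objects": [df1.to_json(orient="split"), df2.to_json(orient="split")],
})
decoded = csv_round_trip(encoded)
decoded["objects"] = decoded["objects"].apply(
    lambda text: pd.read_json(io.StringIO(text), orient="split")
)
json_cell = decoded["objects"][0]
print(type(json_cell).__name__)  # DataFrame
print(json_cell["type"].tolist())
```

The same kind of check is worth running on your real data, since JSON only carries plain values and some dtypes may not come back exactly as they were written.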
Hope this helps.