I would like to join thousands of dataframes into one vaex dataframe. I am following the documentation here: https://vaex.readthedocs.io/en/latest/api.html?highlight=concat#vaex.concat
I do:
df_vaex = vaex.DataFrame()
for i, file in enumerate(files):
    df = pd.read_pickle(file)
    df_vx = vaex.from_pandas(df=df, copy_index=False)
    df_vaex.concat(df_vx)
    if i % 100 == 0:
        print(i)
This does not work.
How can I read and concatenate dataframes in vaex?
I get an error saying that the vaex DataFrame has no concat method: AttributeError: 'DataFrame' object has no attribute 'concat'
Second try, following the first comment:
for i, file in enumerate(files):
    df = pd.read_pickle(file)
    df_vaex_total = vaex.from_pandas(df=df, copy_index=False)
    if i == 0:
        pass
    else:
        print(type(df_vaex_total))  # it is equal to <class 'vaex.dataframe.DataFrameLocal'>
        print(type(df_vx))  # it is equal to <class 'vaex.dataframe.DataFrameLocal'>
        df_vaex_total = pd.concat([df_vaex_total, df_vx])
    if i % 10 == 0:
        print(i)
This fails with: TypeError: cannot concatenate object of type '<class 'vaex.dataframe.DataFrameLocal'>'; only Series and DataFrame objs are valid
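To concatenate dataframes with vaex, call the module-level function vaex.concat on a list of vaex dataframes. concat is not a method of the vaex DataFrame, and pd.concat only accepts pandas objects, which is why both attempts fail. The call looks like this: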
df_final = vaex.concat(list_of_dataframes)
So your code would look something like this:
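import pandas as pd
import vaex

list_of_dataframes = []
for i, file in enumerate(files):
    pdf = pd.read_pickle(file)  # load one pickled pandas dataframe
    df = vaex.from_pandas(pdf)  # convert it to a vaex dataframe
    list_of_dataframes.append(df)
# concatenate all collected vaex dataframes in a single call
df_final = vaex.concat(list_of_dataframes)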