I know that reading a CSV file into a datatable Frame is much faster than into a pandas DataFrame.
However, in my case I have several CSV files that I have to append one by one.
So right now I append each pd.read_csv(file) to an empty DataFrame.
Would it be faster to read each CSV file with datatable, append it to an empty datatable Frame,
and then finally write the combined Frame back to CSV?
In short, I want to know the fastest way to append CSV files other than using a pandas DataFrame.
This is what I do when I have lots of CSV files.
I use glob to grab all the CSV file paths:
from glob import glob
all_csvs = glob('path-to-folder-containing-csv-files/*.csv')
Now read all of them and append them (note the imports for dt and iread):
import datatable as dt
from datatable import iread
all_csvs_appended = dt.rbind(iread(all_csvs))
If your CSV files do not all have the same columns, you may need to pass force=True to rbind.