I'm using pandas to read a large file (11 GB) in chunks:
import pandas as pd

chunksize = 100000
for df_ia in pd.read_csv(file, chunksize=chunksize,
                         iterator=True, low_memory=False):
    ...  # process each chunk here
My question is how to get the total number of chunks. Right now I keep a counter and increment it once per chunk, but this doesn't look like a smart way to do it:
index = 0
chunksize = 100000
for df_ia in pd.read_csv(file, chunksize=chunksize,
                         iterator=True, low_memory=False):
    index += 1
After looping over the whole file, the final value of index is the total number of chunks. Is there a faster way to get it directly?
You can use the enumerate function, like this:
for i, df_ia in enumerate(pd.read_csv(file, chunksize=5,
                                      iterator=True, low_memory=False)):
    ...  # process each chunk here
After the loop finishes, i will be the number of chunks minus one (enumerate starts counting at 0), so the total number of chunks is i + 1.
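As a rough sketch of both options, the snippet below counts chunks with enumerate(..., start=1), so the counter is the chunk count itself, and also estimates the count from the number of lines without parsing the CSV at all. The helper names count_chunks and count_chunks_from_lines and the small in-memory CSV are made up for illustration. The line-based estimate assumes a single header line, no blank lines, and no quoted fields containing newlines; if any of those assumptions fails, it can disagree with what pandas actually yields.

```python
import io
import math

import pandas as pd


def count_chunks(source, chunksize):
    """Count chunks by iterating the reader; this still parses the whole file."""
    n_chunks = 0  # stays 0 if the reader yields nothing
    reader = pd.read_csv(source, chunksize=chunksize)
    for n_chunks, _chunk in enumerate(reader, start=1):
        pass
    return n_chunks


def count_chunks_from_lines(lines, chunksize, header_lines=1):
    """Estimate the chunk count from the line count, without parsing any CSV.

    Assumes one record per line (no embedded newlines, no blank lines).
    """
    n_rows = sum(1 for _ in lines) - header_lines
    return math.ceil(max(n_rows, 0) / chunksize)


# Tiny in-memory stand-in for the large file: a header plus 12 data rows.
csv_text = "a,b\n" + "".join(f"{i},{i * 2}\n" for i in range(12))

print(count_chunks(io.StringIO(csv_text), chunksize=5))
print(count_chunks_from_lines(io.StringIO(csv_text), chunksize=5))
```

For a file on disk, an open file object (for example from open(file)) can be passed to count_chunks_from_lines in place of the StringIO. Counting lines still reads the whole file once, but it skips CSV parsing and DataFrame construction, which is usually the expensive part.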