I'm running a script in Jupyter that is supposed to show a progress bar while applying a function to a DataFrame. In the output I see several bars instead of the expected single one. I tried to clear the output with sys.stdout.flush() (after import sys), but that increases the run time significantly.
When I create a df with fewer rows, say 100, there is only one bar. As I increase the number of rows, more bars appear.
What is the problem, please?
[screenshot of the output]
import pandas as pd
import math

iterator_for_progressbar = 1

def progressBar(current, total, barLength = 20):
    percent = math.ceil(float(current) * 100 / total)
    arrow = '■' * int(percent/100 * barLength)
    spaces = '□' * (barLength - len(arrow))
    print('Calculating: %s%s %d %%' % (arrow, spaces, percent), end='\r')

def myf(row):
    global iterator_for_progressbar
    progressBar(iterator_for_progressbar, len(df), barLength = 20)
    iterator_for_progressbar += 1
    row['1'] = 100

df = pd.DataFrame(index = range(0, 5000), columns = ['1','2','3','4','5'] )
df.apply(myf, axis=1)
You could use the tqdm library for the progress bar instead of creating your own. Use from tqdm.notebook import tqdm for the Jupyter version. tqdm.pandas integrates with pandas operations.

from tqdm.notebook import tqdm

def myf(row):
    row['1'] = 100
    return row

# Registers progress_apply on pandas objects
tqdm.pandas(desc="Calculating")
df = df.progress_apply(myf, axis=1)
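If you would rather keep a hand-rolled bar, one option is to redraw it only when the displayed percentage changes. The question reports a single bar with 100 rows, so capping the redraws at 100 may avoid the duplicated bars. That is an assumption rather than a confirmed diagnosis. Below is a minimal sketch using only the standard library. The names render_bar and ThrottledBar are hypothetical helpers, not part of pandas or tqdm, and the loop stands in for the per-row calls made by df.apply.

```python
import math
import sys


def render_bar(current, total, bar_length=20):
    """Return the bar text for `current` out of `total` steps."""
    percent = math.ceil(current * 100 / total)
    # Integer arithmetic avoids float rounding when sizing the filled part.
    filled = percent * bar_length // 100
    return "Calculating: %s%s %d %%" % (
        "■" * filled,
        "□" * (bar_length - filled),
        percent,
    )


class ThrottledBar:
    """Hypothetical helper: redraws only when the shown percentage changes."""

    def __init__(self, total, bar_length=20, stream=sys.stdout):
        self.total = total
        self.bar_length = bar_length
        self.stream = stream
        self.current = 0
        self.last_percent = -1
        self.writes = 0  # number of redraws, kept only for illustration

    def update(self):
        """Advance by one step and redraw if the percentage moved."""
        self.current += 1
        percent = math.ceil(self.current * 100 / self.total)
        if percent != self.last_percent:
            self.last_percent = percent
            text = render_bar(self.current, self.total, self.bar_length)
            self.stream.write("\r" + text)
            self.stream.flush()
            self.writes += 1


# Stand-in for 5000 per-row calls; in the question this would be
# a bar.update() call inside myf.
bar = ThrottledBar(total=5000)
for _ in range(5000):
    bar.update()
print()  # move past the bar once the loop finishes
```

With this, myf would call bar.update() in place of the global counter and progressBar, so the output is written 100 times instead of 5000.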