I am plotting a treemap and was wondering how to also plot the relative percentage of a tree class, i.e.
Group A = 100
Group B = 30
Group C = 50
Group D = 20
Then, in the plot, it should add:
'50%' for Group A
'15%' for Group B
etc. next to its 'Group X' label. How would I do that given this code?
!pip install squarify
import matplotlib.pyplot as plt
import pandas as pd
import squarify
df = pd.DataFrame({'customers':[8,3,4,2], 'cluster':["group A", "group B", "group C", "group D"] })
squarify.plot(sizes=df['customers'], label=df['cluster'], alpha=.8 )
plt.axis('off')
plt.show()
Assuming that the sum of all values corresponds to 100%, you can build new label strings and plot them instead of, or in addition to, the descriptors from your dataframe.
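To make the arithmetic concrete, here is a minimal sketch using the numbers from the question rather than the dataframe. The dictionary `sizes` and the label layout `"name (share)"` are only illustrative choices, not something squarify requires.

```python
# Values copied from the question's example; the dict name is illustrative.
sizes = {"Group A": 100, "Group B": 30, "Group C": 50, "Group D": 20}

# The total of all values is treated as 100%.
total = sum(sizes.values())

# Build one label per group: the name plus its share of the total.
# The ".0%" format multiplies by 100 and appends the percent sign.
labels = [f"{name} ({value / total:.0%})" for name, value in sizes.items()]

print(labels)
# ['Group A (50%)', 'Group B (15%)', 'Group C (25%)', 'Group D (10%)']
```

A list like `labels` is what gets passed to the `label` argument of `squarify.plot`.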
Print only percentage values:
lbl = [str('{:5.2f}'.format(i/df['customers'].sum()*100)) + "%" for i in df['customers']]
squarify.plot(sizes=df['customers'], label=lbl, alpha=.8 )
Combined description and percentage values:
perc = [str('{:5.2f}'.format(i/df['customers'].sum()*100)) + "%" for i in df['customers']]
lbl = [el[0] + " = " + el[1] for el in zip(df['cluster'], perc)]
squarify.plot(sizes=df['customers'], label=lbl, alpha=.8 )
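The list comprehensions above call `df['customers'].sum()` once per row. If that bothers you, one possible refactoring computes the total a single time inside a small helper. This is only a sketch: `percent_labels` and its `decimals` parameter are hypothetical names introduced here, not part of squarify or pandas.

```python
def percent_labels(names, sizes, decimals=2):
    """Return labels of the form 'name = 12.34%' (hypothetical helper)."""
    sizes = list(sizes)
    total = sum(sizes)  # computed once, treated as 100%
    return [
        f"{name} = {size / total * 100:.{decimals}f}%"
        for name, size in zip(names, sizes)
    ]


# Same data as in the question's dataframe, written as plain lists.
clusters = ["group A", "group B", "group C", "group D"]
customers = [8, 3, 4, 2]

lbl = percent_labels(clusters, customers)
print(lbl)
# ['group A = 47.06%', 'group B = 17.65%', 'group C = 23.53%', 'group D = 11.76%']
```

The helper accepts any iterables, so it should also work with the dataframe columns directly, e.g. `percent_labels(df['cluster'], df['customers'])`, and the result can be passed as `label=lbl` to `squarify.plot`.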
Update 2021-02-01
As of Python 3.6, f-strings (formatted string literals) are the preferred way to format strings. Most of the time, f-strings are more compact and easier to read. The example with combined description and percentage information looks like this using f-strings:
perc = [f'{i/df["customers"].sum()*100:5.2f}%' for i in df['customers']]
lbl = [f'{el[0]} = {el[1]}' for el in zip(df['cluster'], perc)]
squarify.plot(sizes=df['customers'], label=lbl, alpha=.8 )
Either way, the final result looks the same: each tile of the treemap is labeled with its group name followed by its percentage.