pandas matplotlib squarify

Show multiple column values in labels with squarify.plot


I have a dataframe that I'd like to plot as a treemap with squarify. I'd like to show both country_name and counts on the chart by editing the label parameter, but it seems to take only one value.

Example data

import squarify
import pandas as pd
import matplotlib
from matplotlib import pyplot as plt
d = {'country_name':['USA', 'UK', 'Germany'], 'counts':[100, 200, 300]}
dd = pd.DataFrame(data=d)
fig = plt.gcf()
ax = fig.add_subplot()
fig.set_size_inches(16, 4.5)
# Map each count onto the Blues colormap
norm = matplotlib.colors.Normalize(vmin=min(dd.counts), vmax=max(dd.counts))
colors = [matplotlib.cm.Blues(norm(value)) for value in dd.counts]
squarify.plot(label=dd.country_name, sizes=dd.counts, alpha=.7, color=colors)
plt.axis('off')
plt.show()


The expected output shows both country_name and counts on the chart.


Solution

  • The label parameter takes one string per rectangle, so you can build that list yourself: loop through both columns together with zip and combine each pair into a single string. For example:

    
    import squarify
    import pandas as pd
    from matplotlib import pyplot as plt
    import matplotlib
    
    d = {'country_name': ['USA', 'UK', 'Germany'], 'counts': [100, 200, 300]}
    dd = pd.DataFrame(data=d)
    # One label per row: country name on the first line, count on the second
    labels = [f'{country}\n{count}' for country, count in zip(dd.country_name, dd.counts)]
    fig = plt.gcf()
    ax = fig.add_subplot()
    fig.set_size_inches(16, 4.5)
    norm = matplotlib.colors.Normalize(vmin=min(dd.counts), vmax=max(dd.counts))
    colors = [matplotlib.cm.Blues(norm(value)) for value in dd.counts]
    squarify.plot(label=labels, sizes=dd.counts, alpha=.7, color=colors)
    plt.axis('off')
    plt.show()
    

    [Image: squarify plot with combined labels]
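If you need this in more than one place, the label-building step can be pulled into a small helper. This is only a sketch: `make_labels` and its `template` argument are hypothetical names made up for illustration, not part of squarify or pandas. It accepts any two equal-length iterables, so plain lists and DataFrame columns should both work.

```python
def make_labels(names, counts, template="{name}\n{count}"):
    """Combine two equal-length sequences into one label string per item.

    `template` is a str.format pattern with `name` and `count` fields
    (a hypothetical convention for this sketch).
    """
    return [template.format(name=n, count=c) for n, c in zip(names, counts)]


# Same data as in the question, as plain lists
labels = make_labels(['USA', 'UK', 'Germany'], [100, 200, 300])

# A different layout: name and count on one line
inline_labels = make_labels(['USA', 'UK', 'Germany'], [100, 200, 300],
                            template="{name}: {count}")
```

With the dataframe from the question you would pass the columns directly, e.g. `squarify.plot(label=make_labels(dd.country_name, dd.counts), sizes=dd.counts, alpha=.7, color=colors)`.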