Tags: python, pandas, matplotlib, seaborn, histplot

How to show minimum and maximum values of a column as x ticks at the center of the bins


I would like to display the x-axis values at the center of each bin. In addition, the minimum and maximum values of the column (here flipper_length_mm) should be shown as the first and last ticks, respectively. My attempt is below.

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

penguins = sns.load_dataset("penguins")
penguins = penguins.dropna()

max_n = penguins.flipper_length_mm.max()
min_n = penguins.flipper_length_mm.min()
bins = 15
step = (max_n - min_n) / bins
print(min_n, max_n, bins, step)
# 172.0 231.0 15 3.933333333333333

arr_div = np.arange(min_n + step / 2, max_n + step / 2, step=step)
arr_div_r = np.round(arr_div, 0).astype(int)

sns.histplot(data=penguins, x="flipper_length_mm", hue="sex", bins=bins, multiple="stack")
plt.xticks(arr_div, arr_div_r)
# To see the real values without rounding
# plt.xticks(arr_div)

plt.show()

With the code above, the x ticks are centered on the bins, but the first and last tick labels are not the minimum and maximum of the flipper_length_mm column. I would like to show the minimum as 172 and the maximum as 231. How can I achieve this?


Solution

  • You could use minor ticks to add two special tick positions at the start and end. Making them longer avoids overlap with the existing (major) ticks and emphasizes that they are spaced differently. Optionally, you can also change their color. Because the longest ticks push the x-axis label down, setting a negative label padding avoids superfluous white space.

    (Note that np.arange(start, stop, step) can sometimes include the stop value, due to floating point precision issues. The code below uses max_n as the stop value, instead of max_n + step / 2, so that no extra tick is generated past the last bin center.)

    import matplotlib.pyplot as plt
    import seaborn as sns
    import numpy as np
    
    penguins = sns.load_dataset("penguins")
    penguins = penguins.dropna()
    
    max_n = penguins.flipper_length_mm.max()
    min_n = penguins.flipper_length_mm.min()
    bins = 15
    step = (max_n - min_n) / bins
    
    arr_div = np.arange(min_n + step / 2, max_n, step=step)
    arr_div_r = np.round(arr_div, 0).astype(int)
    
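    # Draw the stacked histogram and keep the Axes for further tweaks
    ax = sns.histplot(data=penguins, x="flipper_length_mm", hue="sex", bins=bins, multiple="stack")
    # Major ticks: bin centers, labeled with rounded values
    # (passing labels to set_xticks requires matplotlib 3.5 or newer)
    ax.set_xticks(arr_div, arr_div_r)
    # Minor ticks: the column's minimum and maximum at the outer bin edges
    ax.set_xticks([min_n, max_n], [round(min_n), round(max_n)], minor=True)
    # Longer, blue minor ticks keep their labels clear of the major tick labels
    ax.tick_params(axis='x', which='minor', length=14, color='b', labelcolor='b')
    # The long minor ticks push the axis label down; pull it back up
    ax.xaxis.labelpad = -4
    # Remove horizontal padding so the bars span the full x-range
    ax.margins(x=0)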
    sns.despine()
    plt.tight_layout()
    plt.show()
    

    [Figure: using minor ticks to show start and end]
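If the floating point behavior of np.arange is a concern, one possible alternative is to derive the bin centers from the bin edges with np.linspace, which always returns exactly the requested number of points. The sketch below is only an illustration: bin_centers is a hypothetical helper (not part of seaborn or matplotlib), and it assumes equal-width bins spanning the column's minimum to its maximum, which is what the code above relies on. The values 172.0 and 231.0 are the minimum and maximum printed in the question.

```python
import numpy as np


def bin_centers(lo, hi, bins):
    """Return the midpoints of `bins` equal-width bins spanning [lo, hi].

    Hypothetical helper for illustration; assumes equal-width bins.
    """
    # bins + 1 edges, with both endpoints included exactly
    edges = np.linspace(lo, hi, bins + 1)
    # Midpoint of each pair of adjacent edges
    return (edges[:-1] + edges[1:]) / 2


# Minimum and maximum of flipper_length_mm, as printed in the question
centers = bin_centers(172.0, 231.0, 15)
labels = np.round(centers, 0).astype(int)
print(len(centers))
```

The resulting centers and labels could be passed to ax.set_xticks in place of arr_div and arr_div_r; the minor ticks for the minimum and maximum stay as in the code above.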