python matplotlib boxplot plot-annotations

How to add annotations to boxplot outliers


I have a plot like the following (using plt.boxplot()):

[image: Boxplot with Integer Outliers]

Now I want to plot, next to each outlier (preferably to its top right), a number showing how often that outlier occurred.

Is that somehow achievable?


Solution

  • ax.boxplot returns a dictionary of all the elements in the boxplot. The key you need here from that dict is 'fliers'.

    In boxdict['fliers'], there are the Line2D instances that are used to plot the fliers. We can grab their x and y locations using .get_xdata() and .get_ydata().

    You can find all the unique y locations using a set, and then count how many fliers are plotted at each location by converting the y data to a list and calling .count().

    Then it's just a case of using matplotlib's ax.text to add a text label to the plot.

    Consider the following example:

    import matplotlib.pyplot as plt
    import numpy as np
    
    # Some fake data
    data = np.zeros((10000, 2))
    data[0:4, 0] = 1
    data[4:6, 0] = 2
    data[6:10, 0] = 3
    data[0:9, 1] = 1
    data[9:14, 1] = 2
    data[14:20, 1] = 3
    
    # create figure and axes
    fig, ax = plt.subplots(1)
    
    # plot boxplot, grab dict
    boxdict = ax.boxplot(data)
    
    # the fliers from the dictionary
    fliers = boxdict['fliers']
    
    # loop over boxes in x direction (one Line2D of fliers per box)
    for flier in fliers:
    
        # the y and x positions of the fliers
        yfliers = flier.get_ydata()
        xfliers = flier.get_xdata()
    
        # the unique locations of fliers in y 
        ufliers = set(yfliers)
    
        # loop over unique flier values
        for uf in ufliers:
    
            # all fliers of one box share the same x position, so use the
            # first one; label the point with the number of fliers at uf
            ax.text(xfliers[0] + 0.03, uf + 0.03, list(yfliers).count(uf))
    
    plt.show()
    

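As a variation on the example above, the counting can also be done with np.unique(..., return_counts=True), and the label can be placed with ax.annotate using an offset in points, so it stays to the top right of the marker regardless of the axis scale. This is only a sketch: the helper name annotate_flier_counts and its offset parameter are made up for illustration, and it assumes a vertical boxplot in which all fliers of one box share the same x position.

```python
import matplotlib.pyplot as plt
import numpy as np


def annotate_flier_counts(ax, boxdict, offset=(5, 5)):
    """Label each distinct flier value with how often it occurs.

    Hypothetical helper (not part of matplotlib). Assumes a vertical
    boxplot, so every flier of a box has the same x position.
    Returns a list of (x, y, count) tuples, one per label drawn.
    """
    labels = []
    for line in boxdict["fliers"]:
        x = np.asarray(line.get_xdata(), dtype=float)
        y = np.asarray(line.get_ydata(), dtype=float)
        if y.size == 0:
            continue  # this box has no fliers
        # sorted distinct flier values and the number of times each occurs
        values, counts = np.unique(y, return_counts=True)
        for value, count in zip(values, counts):
            # offset is given in points, i.e. independent of the data scale
            ax.annotate(str(count), xy=(x[0], value), xytext=offset,
                        textcoords="offset points", ha="left", va="bottom")
            labels.append((float(x[0]), float(value), int(count)))
    return labels


# Same fake data as in the example above
data = np.zeros((10000, 2))
data[0:4, 0] = 1
data[4:6, 0] = 2
data[6:10, 0] = 3
data[0:9, 1] = 1
data[9:14, 1] = 2
data[14:20, 1] = 3

fig, ax = plt.subplots(1)
boxdict = ax.boxplot(data)
labels = annotate_flier_counts(ax, boxdict)
```

Call plt.show() afterwards to display the figure. If the fliers are jittered or the boxplot is horizontal, the x[0] shortcut no longer holds and the positions would need to be handled per point.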