python plotly iqr

Plotly: How to change length of whiskers (min/max) in a boxplot?


I know that 1.5 * IQR is a common rule, but I would like to plot other min/max values if possible. I am using plotly (python). Basically, I would like to define a function that draws the boxplot from three parameters: a data frame, a column, and a user-defined multiplier.

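import numpy as np
import pandas as pd
import plotly.graph_objs as go
import plotly.offline as pyo

# a one-column data frame, so that df[0] selects the whole column
df_test = pd.DataFrame(np.array([26124.0, 8124.0, 27324.0, 13188.0, 21156.0]))

def get_boxplot(df, column, multiplier):
    # multiplier is not used yet; this is the part I am asking about
    data = [go.Box(y=df[column], boxpoints="outliers")]
    return pyo.plot(data)

get_boxplot(df_test, 0, 3)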

My goal is to replace 1.5 * IQR with the multiplier parameter, for example 3 or any other number.

Do you have an idea of how to change my function?

Thank you!


Solution

  • Getting the exact result you are looking for does not seem to be possible from Python. As far as I can tell, the relevant properties are at best only available in the JavaScript context.

    You still have some options regarding the placement of the whiskers, though. And you are right, by the way, about the 1.5 * IQR part. From help(fig) you can find:

    By default, the whiskers correspond to the box' edges +/- 1.5 times the interquartile range (IQR: Q3-Q1), see "boxpoints" for other options.

    And under boxpoints you'll find:

    If "outliers", only the sample points lying outside the whiskers are shown If "suspectedoutliers", the outlier points are shown and points either less than 4*Q1-3*Q3 or greater than 4*Q3-3*Q1 are highlighted (see outliercolor) If "all", all sample points are shown If False, only the box(es) are shown with no sample points
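
The two thresholds in the "suspectedoutliers" description are just another way of writing Q1 - 3*IQR and Q3 + 3*IQR. Here is a small sketch of that arithmetic on the sample values from the question. It assumes NumPy's default percentile interpolation, which may differ slightly from the quartile method Plotly uses internally.

```python
import numpy as np

# Sample values taken from the question
values = np.array([26124.0, 8124.0, 27324.0, 13188.0, 21156.0])

# NumPy's default (linear) percentiles; Plotly's own quartile
# method may give slightly different Q1/Q3 values.
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Thresholds as written in the boxpoints description
lower_doc = 4 * q1 - 3 * q3
upper_doc = 4 * q3 - 3 * q1

# The same thresholds expressed with the interquartile range
lower_iqr = q1 - 3 * iqr
upper_iqr = q3 + 3 * iqr

print(lower_doc, lower_iqr)
print(upper_doc, upper_iqr)
```

So "suspectedoutliers" already highlights points beyond 3 * IQR, but it does not move the whiskers themselves.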

    So for the different values of 'boxpoints' (False, 'all', 'outliers') you'll get:

    [Figure: box plots for boxpoints = False, 'all' and 'outliers']

    And as you'll see below, whether or not boxpoints are shown will also determine the placement of the whiskers. So you could use False, 'all' or 'outliers' as arguments in a custom function to at least be able to switch between those options. And judging by your question, boxpoints=False shouldn't be too far off target.

    Here's a way to do it:

    Code with boxpoints set to False:

    # imports
    import plotly.graph_objs as go
    import numpy as np

    # data
    np.random.seed(123)
    y0 = np.random.randn(50) - 1
    y0[-1] = 4  # include an outlier

    # custom plotly function
    def get_boxplot(boxpoints):
        # boxpoints: False, 'all', 'outliers' or 'suspectedoutliers'
        fig = go.Figure(go.Box(y=y0, boxpoints=boxpoints, pointpos=0))
        fig.show()

    # use boxpoints='outliers' instead to reproduce the second plot
    get_boxplot(boxpoints=False)
    

    Plot 1 - boxpoints = False:

    [Figure: box plot without sample points]

    Plot 2 - boxpoints = 'outliers':

    [Figure: box plot with outlier points shown]

    This will raise another issue though, since the markers by default are not shown in the first case. But you can handle that by including another trace like this:

    Complete plot:

    [Figure: box plot with boxpoints = False and all sample points overlaid]

    Complete code:

    # imports
    import plotly.graph_objs as go
    import numpy as np

    # data
    np.random.seed(123)
    y0 = np.random.randn(50) - 1
    x0 = [0 for y in y0]  # constant x position for the marker trace
    y0[-1] = 4  # include an outlier

    # custom plotly function
    def get_boxplot(boxpoints):
        fig = go.Figure(go.Box(y=y0, boxpoints=boxpoints, pointpos=0))

        # with boxpoints=False no markers are drawn, so overlay a second
        # box whose line and fill are transparent and only shows its points
        if boxpoints is False:
            fig.add_trace(go.Box(x=x0,
                                 y=y0, boxpoints='all', pointpos=0,
                                 marker=dict(color='rgb(66, 66, 244)'),
                                 line=dict(color='rgba(0,0,0,0)'),
                                 fillcolor='rgba(0,0,0,0)'
                                 ))

        fig.show()

    get_boxplot(boxpoints=False)
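
A possible workaround for the custom multiplier, separate from the boxpoints approach above: recent Plotly releases let go.Box take precomputed statistics through the q1, median, q3, lowerfence and upperfence attributes, so the whisker positions can be calculated in Python and handed over directly. The sketch below is only an illustration under that assumption. The names whisker_stats and get_boxplot_custom are made up for this example, the quartiles use NumPy's default interpolation (which may not match Plotly's own), and the outliers are drawn as a separate scatter trace.

```python
import numpy as np


def whisker_stats(values, multiplier=1.5):
    """Box statistics with whiskers at multiplier * IQR (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # NumPy's default (linear) percentiles; Plotly's own quartile
    # method may give slightly different Q1/Q3 values.
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_limit = q1 - multiplier * iqr
    high_limit = q3 + multiplier * iqr
    inside = values[(values >= low_limit) & (values <= high_limit)]
    # Whiskers end at the most extreme points still inside the limits,
    # clamped so they never end up inside the box itself.
    lowerfence = min(inside.min(), q1) if inside.size else q1
    upperfence = max(inside.max(), q3) if inside.size else q3
    outliers = values[(values < low_limit) | (values > high_limit)]
    stats = {
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "lowerfence": float(lowerfence),
        "upperfence": float(upperfence),
    }
    return stats, outliers.tolist()


def get_boxplot_custom(values, multiplier=1.5, name="data"):
    """Build a figure from the precomputed statistics (hypothetical helper)."""
    import plotly.graph_objects as go  # imported lazily; needs plotly installed

    stats, outliers = whisker_stats(values, multiplier)
    # Each precomputed statistic is passed as a one-element list (one box)
    fig = go.Figure(go.Box(x=[name], **{k: [v] for k, v in stats.items()}))
    # Points outside the custom whiskers, drawn as plain markers
    fig.add_trace(go.Scatter(x=[name] * len(outliers), y=outliers,
                             mode="markers", name="outliers"))
    return fig


# Example call with the data from the question (opens the plot):
# get_boxplot_custom([26124.0, 8124.0, 27324.0, 13188.0, 21156.0], multiplier=3).show()
```

Keeping the statistics in a separate function makes it easy to check the whisker positions before plotting, and the multiplier can then be any number you like.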