python seaborn

Two sided seaborn barplot not referencing proper y-values


I'm trying to make a two-sided barplot using Seaborn in Python, but one side of the plot doesn't appear to use the proper Price levels on the y-axis.

Data looks like this:

    Time                        Symbol  Position  Operation  Side  Price   Size
0   2023-07-25 15:09:12.249964  MCDU3   0         0          1     0.7595  -2
1   2023-07-25 15:09:12.255196  MCDU3   1         0          1     0.7594  -7
2   2023-07-25 15:09:12.258575  MCDU3   2         0          1     0.7593  -8
3   2023-07-25 15:09:12.267100  MCDU3   3         0          1     0.7592  -16
4   2023-07-25 15:09:12.270027  MCDU3   4         0          1     0.7591  -14
5   2023-07-25 15:09:12.272276  MCDU3   5         0          1     0.759   -407
6   2023-07-25 15:09:12.274441  MCDU3   6         0          1     0.7589  -14
7   2023-07-25 15:09:12.276581  MCDU3   7         0          1     0.7588  -14
8   2023-07-25 15:09:12.278742  MCDU3   8         0          1     0.7587  -264
9   2023-07-25 15:09:12.280768  MCDU3   9         0          1     0.7586  -15
10  2023-07-25 15:09:12.283094  MCDU3   0         0          0     0.7596  102
11  2023-07-25 15:09:12.286398  MCDU3   1         0          0     0.7597  8
12  2023-07-25 15:09:12.289751  MCDU3   2         0          0     0.7598  8
13  2023-07-25 15:09:12.292842  MCDU3   3         0          0     0.7599  17
14  2023-07-25 15:09:12.295488  MCDU3   4         0          0     0.76    409
15  2023-07-25 15:09:12.297606  MCDU3   5         0          0     0.7601  16
16  2023-07-25 15:09:12.299546  MCDU3   6         0          0     0.7602  16
17  2023-07-25 15:09:12.302073  MCDU3   7         0          0     0.7603  14
18  2023-07-25 15:09:12.305483  MCDU3   8         0          0     0.7604  14
19  2023-07-25 15:09:12.307733  MCDU3   9         0          0     0.7605  658
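
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample as a DataFrame. The variable name `d` is taken from the plotting code in this post; leaving out the Time, Symbol and Operation columns is my own simplification, since the plots only use Side, Price and Size.

```python
import pandas as pd

# Values copied from the sample table; Time, Symbol and Operation are
# omitted here (an assumption: the plotting code never reads them).
d = pd.DataFrame({
    "Position": list(range(10)) * 2,
    "Side": [1] * 10 + [0] * 10,
    "Price": [
        0.7595, 0.7594, 0.7593, 0.7592, 0.7591,
        0.7590, 0.7589, 0.7588, 0.7587, 0.7586,
        0.7596, 0.7597, 0.7598, 0.7599, 0.7600,
        0.7601, 0.7602, 0.7603, 0.7604, 0.7605,
    ],
    "Size": [
        -2, -7, -8, -16, -14, -407, -14, -14, -264, -15,
        102, 8, 8, 17, 409, 16, 16, 14, 14, 658,
    ],
})

# Each side has ten rows and its own, non-overlapping set of Price levels.
print(d.groupby("Side")["Price"].agg(["min", "max", "count"]))
```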

The code looks like this. It's not clear to me why the second plot uses the Price levels of the first Side instead of the Price values actually present in its own data.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

f, ax = plt.subplots()
sns.set_color_codes('muted')
# d.loc[d.Side==1,'Size'] = d[d.Side==1].Size*-1

# d is the DataFrame shown above
sns.barplot(data = d[d.Side==1], x = 'Size', y = 'Price', color = 'b', orient = 'h')
sns.barplot(data = d[d.Side==0], x = 'Size', y = 'Price', color = 'r', orient = 'h')
plt.show()



Solution

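  • By default, the bar positions are categorical: each distinct Price is internally numbered 0, 1, 2, ... and then given a string label, so the bars are placed by category index rather than by the Price value itself. The native_scale parameter (added to barplot in seaborn 0.13) places the bars at their numerical positions instead. With native_scale=True, the width of the bars is derived from the bar positions that are closest together.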

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    
    sns.barplot(data=d[d.Side == 1], x='Size', y='Price', color='b', orient='h', native_scale=True)
    sns.barplot(data=d[d.Side == 0], x='Size', y='Price', color='r', orient='h', native_scale=True)
    
    plt.show()
    

    [Figure: sns.barplot with native_scale=True]

    PS: Without native_scale, a similar plot can be created with matplotlib:

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    import numpy as np
    
    sns.set_style('whitegrid')
    sns.set_color_codes('muted')
    
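    # bar height: 80% of the smallest gap between two distinct Price levels
    barwidth = 0.8 * np.min(np.diff(np.unique(d['Price'])))
    # with data=..., the strings 'Price' and 'Size' are looked up as columns
    plt.barh('Price', 'Size', color='b', data=d[d.Side == 1], height=barwidth)
    plt.barh('Price', 'Size', color='r', data=d[d.Side == 0], height=barwidth)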
    sns.despine()
    
    plt.show()
    

    [Figure: double barh plot via matplotlib]
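
To make the difference between categorical and numerical positions concrete, here is a small sketch using only numpy. The helper `categorical_positions` is hypothetical and is only a simplified model of a categorical axis (it is not seaborn's actual implementation): it replaces each value by its index among the sorted distinct values of that one subset.

```python
import numpy as np

# Price levels from the question, split by Side.
side1_prices = np.array([0.7595, 0.7594, 0.7593, 0.7592, 0.7591,
                         0.7590, 0.7589, 0.7588, 0.7587, 0.7586])
side0_prices = np.array([0.7596, 0.7597, 0.7598, 0.7599, 0.7600,
                         0.7601, 0.7602, 0.7603, 0.7604, 0.7605])


def categorical_positions(values):
    """Hypothetical helper: index of each value among the sorted
    distinct values of this subset (0, 1, 2, ...)."""
    _, positions = np.unique(values, return_inverse=True)
    return positions


# Treated as categories, both subsets occupy the same slots 0..9,
# even though they share no Price level.
pos1 = categorical_positions(side1_prices)
pos0 = categorical_positions(side0_prices)
print(sorted(pos1.tolist()) == sorted(pos0.tolist()))

# On a numerical scale the positions are the prices themselves, and the
# smallest gap between neighbouring levels bounds the usable bar width.
all_prices = np.unique(np.concatenate([side1_prices, side0_prices]))
min_gap = np.min(np.diff(all_prices))
print(0.8 * min_gap)
```

In this simplified model the two subsets collapse onto the same ten slots, which is the kind of collision that numerical positions avoid.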