pythonpandasloopsbackground-colorpercentile

Colouring background of dataframe cells using percentiles


So I have a timeseries dataframe, df, which has 'n' columns and a load of rows:

df = pd.read_csv('percentiles.csv', index_col=0, parse_dates=True)

The last 3 rows of df look something like this:

ATH ATL 12MH 12ML 3MH 3ML 1MH 1ML
Date
2024-02-16 6 0 8 -1 11 -8 15 -16
2024-02-19 8 -1 10 -2 11 -5 22 -11
2024-02-20 8 0 13 0 16 -2 40 -3

Im trying to export this df, as a table, (pdf), with the backgrounds of the cells coloured depending on how high or low the value is in its column. One way Ive found is to use percentles.

I have made another dataframe, to determine the percentiles of each column:

percentiles = [0, 0.1, 0.2, 0.8, 0.9]
df2 = df.quantile(q=percentiles, axis=0)

df2, containing the percentile values for each column, looks something like this:

ATH ATL 12MH 12ML 3MH 3ML 1MH 1ML
0.0 0.0 -115.0 0.0 -74.0 0.0 -122.0 0.0 -136.0
0.1 0.0 -8.0 0.0 -8.0 1.0 -26.1 4.0 -44.1
0.2 1.8 -4.0 1.0 -4.0 3.0 -14.0 7.0 -28.0
0.8 10.0 0.0 11.0 0.0 20.0 -1.0 33.0 -4.0
0.9 15.0 0.0 16.0 0.0 29.0 0.0 44.1 -2.0

Ive made a dictionary that specifies a color for each percentile:

percentile_color = {0:'red', 0.1: 'orange', 0.2: 'white', 0.8: 'lightgreen', 0.9: 'green'}

Id like to color each cell in each column by the color of its percentile for the column. I can do it for a series (column), but as soon as I make the dataframe, where each column has different values for the percentiles, Im stuck. Any suggestions? Thanks!


Solution

  • You could use styler.apply with a custom function that uses quantile and merge_asof:

    def get_colors(s):
        s = s.astype(float).sort_values()
        return (pd.merge_asof(s.reset_index(), s.quantile(q=percentiles).reset_index())
                  .set_index('Date')['index']
                  .map(lambda x: f'background-color: {percentile_color.get(x, "")}')
               )
    
    df.style.apply(get_colors)
    

    Output:

    color dataframe per quantile in column

    Note that you can choose if a cell gets the color of the previous or next quantile with direction:

    def get_colors(s):
        s = s.astype(float).sort_values()
        return (pd.merge_asof(s.reset_index(), s.quantile(q=percentiles).reset_index(),
                             direction='forward')
                  .set_index('Date')['index']
                  .map(lambda x: f'background-color: {percentile_color.get(x, "")}')
               )
    
    df.style.apply(get_colors)
    

    Output:

    color dataframe per quantile in column

    And, for duplicated quantiles you can force getting the lowest one with:

    def get_colors(s):
        s = s.astype(float).sort_values()
        return (pd.merge_asof(s.reset_index(),
                              s.quantile(q=percentiles).drop_duplicates().reset_index())
                  .set_index('Date')['index']
                  .map(lambda x: f'background-color: {percentile_color.get(x, "")}')
               )
    
    df.style.apply(get_colors)
    

    Output:

    color dataframe per quantile in column