So I have a timeseries dataframe, df, which has 'n' columns and a load of rows:
df = pd.read_csv('percentiles.csv', index_col=0, parse_dates=True)
The last 3 rows of df look something like this:
ATH | ATL | 12MH | 12ML | 3MH | 3ML | 1MH | 1ML | |
---|---|---|---|---|---|---|---|---|
Date | ||||||||
2024-02-16 | 6 | 0 | 8 | -1 | 11 | -8 | 15 | -16 |
2024-02-19 | 8 | -1 | 10 | -2 | 11 | -5 | 22 | -11 |
2024-02-20 | 8 | 0 | 13 | 0 | 16 | -2 | 40 | -3 |
Im trying to export this df, as a table, (pdf), with the backgrounds of the cells coloured depending on how high or low the value is in its column. One way Ive found is to use percentles.
I have made another dataframe, to determine the percentiles of each column:
percentiles = [0, 0.1, 0.2, 0.8, 0.9]
df2 = df.quantile(q=percentiles, axis=0)
df2, containing the percentile values for each column, looks something like this:
ATH | ATL | 12MH | 12ML | 3MH | 3ML | 1MH | 1ML | |
---|---|---|---|---|---|---|---|---|
0.0 | 0.0 | -115.0 | 0.0 | -74.0 | 0.0 | -122.0 | 0.0 | -136.0 |
0.1 | 0.0 | -8.0 | 0.0 | -8.0 | 1.0 | -26.1 | 4.0 | -44.1 |
0.2 | 1.8 | -4.0 | 1.0 | -4.0 | 3.0 | -14.0 | 7.0 | -28.0 |
0.8 | 10.0 | 0.0 | 11.0 | 0.0 | 20.0 | -1.0 | 33.0 | -4.0 |
0.9 | 15.0 | 0.0 | 16.0 | 0.0 | 29.0 | 0.0 | 44.1 | -2.0 |
Ive made a dictionary that specifies a color for each percentile:
percentile_color = {0:'red', 0.1: 'orange', 0.2: 'white', 0.8: 'lightgreen', 0.9: 'green'}
Id like to color each cell in each column by the color of its percentile for the column. I can do it for a series (column), but as soon as I make the dataframe, where each column has different values for the percentiles, Im stuck. Any suggestions? Thanks!
You could use styler.apply
with a custom function that uses quantile
and merge_asof
:
def get_colors(s):
s = s.astype(float).sort_values()
return (pd.merge_asof(s.reset_index(), s.quantile(q=percentiles).reset_index())
.set_index('Date')['index']
.map(lambda x: f'background-color: {percentile_color.get(x, "")}')
)
df.style.apply(get_colors)
Output:
Note that you can choose if a cell gets the color of the previous or next quantile with direction
:
def get_colors(s):
s = s.astype(float).sort_values()
return (pd.merge_asof(s.reset_index(), s.quantile(q=percentiles).reset_index(),
direction='forward')
.set_index('Date')['index']
.map(lambda x: f'background-color: {percentile_color.get(x, "")}')
)
df.style.apply(get_colors)
Output:
And, for duplicated quantiles you can force getting the lowest one with:
def get_colors(s):
s = s.astype(float).sort_values()
return (pd.merge_asof(s.reset_index(),
s.quantile(q=percentiles).drop_duplicates().reset_index())
.set_index('Date')['index']
.map(lambda x: f'background-color: {percentile_color.get(x, "")}')
)
df.style.apply(get_colors)
Output: