Tags: python, matplotlib, seaborn, heatmap, colorbar

Seaborn heatmap with a logarithmic colorbar with each color representing one order of magnitude?


I'm trying to plot some data in a heatmap. The expected data can range from 1×10⁰ to 1×10⁹, so I'd like to use a discrete logarithmic color scale where one color represents one order of magnitude.

Some example data:

1   9.233444e+05  1.667961e+05  1.000000e+00       1.000000  1.628000e+05   49113.404001  1.297632e+06  1.417425e+08  9.874216e+06  6.604796e+05  244640.784032  2.644814e+06  302680.552383  4.179854e+05  157001.632229    
2   1.035594e+06  1.000000e+00  1.000000e+00       1.000000  1.678359e+05   53792.826077  1.546623e+06  1.605406e+08  1.360309e+07  8.274078e+05  387068.818619  3.309055e+06  327772.433710  5.212390e+05  183668.570530    
3   1.757615e+06  2.674736e+05  1.000000e+00   21193.964624  2.619884e+05   87328.604686  1.382115e+06  2.223598e+08  1.332000e+07  1.164365e+06  477190.744769  6.037433e+06  574947.946448  9.707206e+05  339096.608121    
4   1.568719e+06  2.586641e+05  1.000000e+00       1.000000  2.269648e+05  111880.599266  1.559659e+06  2.214848e+08  1.681052e+07  1.013042e+06  365251.506398  5.524255e+06  498559.533033  8.148302e+05  273630.913931    
5   1.836254e+06  3.793271e+05  1.000000e+00       1.000000  1.000000e+00  164330.522936  2.023138e+06  2.859675e+08  1.741852e+07  2.131141e+06  597542.798838  4.079047e+06  347385.192677  8.431494e+05  194838.752708    
6   2.384388e+06  1.000000e+00  1.000000e+00       1.000000  4.542583e+05  262315.667507  2.814922e+06  3.351289e+08  1.002064e+07  5.739982e+05   67494.501506  1.013531e+06  965788.960452  7.560733e+04  483717.390066    
7   3.025885e+06  1.626263e+06  5.567249e+06  278075.276104  5.024219e+05  510559.593751  2.269265e+06  1.052652e+07  1.000000e+00  1.000000e+00  863660.428694  1.188858e+07       1.000000  2.953666e+06       1.000000    
8   3.665986e+06  1.704403e+06  7.424446e+06  173460.962223  9.440780e+05  175174.671543  1.000000e+00  1.864523e+06  1.000000e+00  1.000000e+00  179608.650232  2.177289e+06       1.000000  5.171294e+05       1.000000    
9   4.206076e+06  1.971293e+06  8.298467e+06  835034.598981  1.675579e+06       1.000000  1.569502e+06  1.000000e+00  1.000000e+00  1.000000e+00       1.000000  6.342325e+05       1.000000  1.000000e+00       1.000000    
10  4.972394e+06  2.009257e+06  7.716523e+06  670068.164179  7.422010e+05  145980.973202  3.615038e+06  1.048249e+06  1.000000e+00  1.000000e+00       1.000000  4.702494e+05       1.000000  1.000000e+00       1.000000 

I wrote the following code:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

data=pd.read_excel('example_data.xlsx')

custom_color_map=sns.color_palette("rocket", as_cmap=False, n_colors=9)
fig, ax = plt.subplots(figsize=(20,10))
ax = sns.heatmap(data, norm=LogNorm(), annot=True, cmap = custom_color_map, linewidths=0.05, linecolor='grey')

plt.tight_layout()

plt.show() 

This resulted in the following heatmap:

[image: heatmap]

The lower end of the colorbar is as desired: the first color starts at 10⁰. But the second color does not start at 10¹; it starts somewhere below 10¹. This trend continues for the following colors.

The main problem, however, is that the mapping of values to colors is unexpected, which can be seen from the annotated values: in column 0 of the heatmap, the values 6.3×10⁵ and 9.2×10⁵ have the same orange color as the following 1×10⁶, etc. Then the last value in this column, 5×10⁶, has a different color, despite being in the same order of magnitude as 1×10⁶.

Why is this and how can I adjust the color mapping so that one color matches one order of magnitude?


Solution

  • The thing is, your data doesn't reach 10⁹, and the colormap is scaled to the data. How could your code even guess that you expect the 9 colors to span exactly 10⁰ to 10⁹? That information appears nowhere. All you said is that you want 9 different colors, applied on a logarithmic scale. Your data could range from 0.1 to 2, or from 10¹ to 10³⁰, and the same code is supposed to produce a meaningful colormap for those data too, right?

    There is an easy solution, though: tell the heatmap (or, more accurately in your case, the log normalizer) "pretend my data ranges from 10⁰ to 10⁹, even if it doesn't". You do that by passing vmin/vmax, not to the heatmap itself as we usually do, but to LogNorm: once you pass a norm parameter, it is the norm's job to decide how values are mapped onto the [0, 1] interval.
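
    To see why the autoscaled colors do not line up with powers of ten, here is a small sketch of the arithmetic. It assumes the 9 palette colors split the normalized log range into 9 equal bins, and that the data run from 1 to about 3.35×10⁸ as in the example table. `color_edges` is a helper name made up for this illustration, not a matplotlib function.

```python
import math

def color_edges(vmin, vmax, n_colors):
    """Values at which a log-scaled map with n_colors equal bins switches color."""
    lo, hi = math.log10(vmin), math.log10(vmax)
    step = (hi - lo) / n_colors
    return [10 ** (lo + i * step) for i in range(n_colors + 1)]

# Autoscaled case: LogNorm() takes vmin/vmax from the data
# (assumed here: min 1, max about 3.35e8, read off the example table).
auto_edges = color_edges(1, 3.351289e8, 9)

# Fixed case: LogNorm(vmin=1, vmax=1e9)
fixed_edges = color_edges(1, 1e9, 9)

print("autoscaled:", [f"{e:.3g}" for e in auto_edges])
print("fixed:     ", [f"{e:.3g}" for e in fixed_edges])
```

    With the autoscaled range, each bin is a little under one decade wide, so one bin runs from roughly 4.8×10⁵ to 4.3×10⁶. That is why 6.3×10⁵ and 1×10⁶ share a color while 5×10⁶ does not. With the fixed range, the edges land on powers of ten.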

    So, the short answer:

    ax = sns.heatmap(data, norm=LogNorm(vmin=1, vmax=1e9), annot=True, cmap = custom_color_map, linewidths=0.05, linecolor='grey')
    

    My own full [mre] (ideally, a [mre] has inlined data; in this case that is quite easy to do):

    import seaborn as sns
    import matplotlib.pyplot as plt
    from matplotlib.colors import LogNorm
    import numpy as np
    
    # Log-normal sample whose bulk falls between 1 and 10⁹ (about ±2σ in log space),
    # clipped to [1, 5×10⁸] to imitate your problem: the data doesn't quite reach 10⁹
    data = np.clip(np.random.lognormal(10.3, 5, (10, 10)), 1, 5e8)
    custom_color_map=sns.color_palette("rocket", as_cmap=False, n_colors=9)
    fig, ax = plt.subplots(figsize=(20,10))
    ax = sns.heatmap(data, norm=LogNorm(vmin=1, vmax=1e9), annot=True, cmap = custom_color_map, linewidths=0.05, linecolor='grey')
    plt.tight_layout()
    plt.show() 
    

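
    As a sanity check without drawing anything, the sketch below asks matplotlib which palette entry a few values map to under LogNorm(vmin=1, vmax=1e9). The gray palette is only a placeholder for the 9 "rocket" colors, and `color_indices` is a hypothetical helper written for this example. Values sitting exactly on a power of ten are left out, on the assumption that floating-point rounding could put them on either side of an edge.

```python
import numpy as np
from matplotlib.colors import ListedColormap, LogNorm

# Nine placeholder gray levels stand in for the 9-color "rocket" palette.
colors = [(g, g, g) for g in np.linspace(0.1, 0.9, 9)]
palette = np.array(colors)
cmap = ListedColormap(colors)
norm = LogNorm(vmin=1, vmax=1e9)

def color_indices(values):
    """Return, for each value, the index (0..8) of the palette color it maps to."""
    rgb = cmap(norm(np.asarray(values, dtype=float)))[:, :3]
    # Match each resulting color to the nearest palette entry.
    distances = np.abs(rgb[:, None, :] - palette[None, :, :]).sum(axis=2)
    return distances.argmin(axis=1)

samples = [1, 5, 6.3e5, 9.2e5, 1.5e6, 5e6, 3.35e8]
for value, index in zip(samples, color_indices(samples)):
    print(f"{value:>10.3g} -> color {index}")
```

    Values within the same decade should come out with the same index, for example 6.3×10⁵ and 9.2×10⁵ together, and 1.5×10⁶ and 5×10⁶ together.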