python numpy matplotlib axes histogram2d

How to apply logarithmic axis labels without log scaling image (matplotlib imshow)


I have a large data set whose values are distributed logarithmically. I want to make a heat map, so I compute a 2D histogram and pass it to imshow. Because the data is logarithmic, I pass the log of the data to the histogram. When I make the plot, however, I want the axes restored to the original values (i.e. 10^(histogram bin value)) and drawn as log axes. If I set the axes to log scale, the image comes out skewed. The data was already 'logged' when I passed it to the histogram, so I don't want the image affected, just the axes. In the example below, I want the image from the left panel with the axes from the right panel.

I guess I could do it with a fake overlaid axis, but I don't like to do that sort of thing if there's a better way...

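[Image: the two panels produced by the code below. Left: the histogram image with linear axes labelled Log10 X and Log10 Y. Right: the same data with log-scaled X and Y axes, where the image is distorted.]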

import numpy as np
import matplotlib.pyplot as plt

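# Log-uniform samples between 10^0 and 10^5 (parentheses are needed: ** binds tighter than *)
x = 10**(np.random.random(10000) * 5)
y = 10**(np.random.random(10000) * 5)

# Histogram the log10 values, so the bin edges run from about 0 to 5
samps, xedges, yedges = np.histogram2d(np.log10(y), np.log10(x), bins=50)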

ax = plt.subplot(121)

plt.imshow(samps, extent=[0,5,0,5])
plt.xlabel('Log10 X')
plt.ylabel('Log10 Y')

ax = plt.subplot(122)    
plt.imshow(samps, extent=[10**0,10**5,10**0,10**5])
plt.xlabel('X')
plt.ylabel('Y')
plt.xscale('log')
plt.yscale('log')
plt.show()

Solution

  • You need to use a custom formatter. Here's an example from the matplotlib docs: https://matplotlib.org/examples/pylab_examples/custom_ticker1.html

    I tend to use FuncFormatter, as the example does. The main trick is that your function needs to take two arguments, x and pos. x is the tick value; pos is the tick's position (its index along the axis), which I don't need here. Perhaps not even intentionally, but you can use FuncFormatter as a decorator, which is what I do below:

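    # %matplotlib inline  (IPython/Jupyter magic; uncomment only in a notebook)
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import FuncFormatter
    
    @FuncFormatter
    def fake_log(x, pos):
        'The two args are the tick value and the tick position'
        return r'$10^{%d}$' % (x)
    
    # Log-uniform samples between 10^0 and 10^5
    x = 10**(np.random.random(10000) * 5)
    y = 10**(np.random.random(10000) * 5)
    
    # Histogram the log10 values, so the bin edges run from about 0 to 5
    samps, xedges, yedges = np.histogram2d(np.log10(y), np.log10(x), bins=50)
    
    fig, ax1 = plt.subplots()
    # The image stays in log10 units; only the tick labels are rewritten
    ax1.imshow(samps, extent=[0, 5, 0, 5])
    ax1.xaxis.set_major_formatter(fake_log)
    ax1.yaxis.set_major_formatter(fake_log)
    ax1.set_xlabel('X')
    ax1.set_ylabel('Y')
    plt.show()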
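
One caveat with the `%d` format above: it truncates, so a tick at 2.5 would be labelled as 10^2. Here is a rough sketch of a variant that avoids this. The name `fake_log_g`, the `%g` format and the `MultipleLocator(1)` call are my own additions, not part of the code above, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MultipleLocator


@FuncFormatter
def fake_log_g(x, pos):
    # x is the tick value in axis units (here: log10 of the data).
    # pos is the tick's index along the axis, or None if no position is given.
    # %g keeps fractional exponents instead of truncating them like %d does.
    return r'$10^{%g}$' % x


fig, ax = plt.subplots()
ax.set_xlim(0, 5)
ax.xaxis.set_major_locator(MultipleLocator(1))  # request ticks at whole decades only
ax.xaxis.set_major_formatter(fake_log_g)
fig.canvas.draw()  # force the tick labels to be generated

labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

Because a `FuncFormatter` instance is callable, you can also try it directly, e.g. `fake_log_g(2.5, None)`, to see what label a given tick value would get.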
    

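
For comparison, here is a different technique from the formatter approach above, offered only as a sketch: draw the histogram with `pcolormesh` on the real (un-logged) bin edges and then switch the axes to log scale. Since the bins are evenly spaced in log10, the cells should still come out equal-sized on log axes, so the image is not skewed and the ticks are genuine log ticks. The variable names, the fixed seed and the Agg backend line are my own assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # fixed seed only to make the sketch repeatable
x = 10**(rng.random(10000) * 5)
y = 10**(rng.random(10000) * 5)

# Histogram in log space; the first argument maps to the first axis of counts
counts, lx_edges, ly_edges = np.histogram2d(np.log10(x), np.log10(y), bins=50)

fig, ax = plt.subplots()
# pcolormesh expects C[row, col] with rows along y, hence the transpose.
# 10**edges converts the log10 bin edges back to data units.
mesh = ax.pcolormesh(10**lx_edges, 10**ly_edges, counts.T)
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel('X')
ax.set_ylabel('Y')
fig.colorbar(mesh, ax=ax, label='count')
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. The trade-off is that `pcolormesh` can be slower than `imshow` for very large grids, so the formatter trick may still be preferable there.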