Tags: python, python-3.x, numpy, datagrid, histogram2d

NumPy histogram2d with a rotated, non-orthogonal binning grid


I need to compute (and plot) a histogram2d, but my binning grid is rotated and also non-orthogonal.
One way of doing this would be to transform my data into a Cartesian system aligned with the grid, compute the histogram2d there, and then apply the inverse transformation.
Can this be done directly, without the overhead of these transformations?

I guess my question is: how do I define the bins for my histogram2d in this case? (AFAIK, histogram2d only accepts bins aligned with the x and y axes.)

My data is two huge lists of points (10k to 100k each). Their coordinates are given in a Cartesian coordinate system (actually a projected CRS, because these are real-world locations), but the points are organized on a regular grid that is not aligned with the X and Y axes (it is rotated) and that may or may not be orthogonal. The binning grid will be derived from it, so it will be a (rotated) regular grid of quadrilaterals.

I have seen that matplotlib has a QuadMesh object, so I'm hopeful, but I'm not sure how to handle this in NumPy.

Basically this is what I want to achieve:

[Image: the desired histogram on a rotated, non-orthogonal grid]


Solution

  • After some testing, I came to the conclusion that the overhead of transforming the coordinates into a Cartesian (grid-aligned) system to compute the histogram, and back again for plotting, is acceptable. Matrix operations in NumPy are fairly efficient, and I can handle 115+ million points in less than 7 seconds.
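The forward step described above can be sketched as follows. This is only an illustration under stated assumptions: the grid is assumed to be described by an `origin` corner and two cell edge vectors `u` and `v` (hypothetical names and values, not taken from my real data), and `to_grid_coords` is a helper made up for this example. Once the points are expressed as fractional (i, j) grid indices, the bins are axis-aligned and `np.histogram2d` can be used as usual.

```python
import numpy as np

def to_grid_coords(xy, origin, u, v):
    """Map (N, 2) world coordinates to fractional (i, j) grid indices."""
    basis = np.column_stack([u, v])  # columns are the two cell edge vectors
    # Solve basis @ [i, j] = xy - origin for all points at once.
    return np.linalg.solve(basis, (xy - origin).T).T

# Hypothetical grid: 25 m cells, rotated by 30 degrees, with an 80 degree
# angle between the two axes (so the grid is not orthogonal).
origin = np.array([500_000.0, 4_000_000.0])
u = 25.0 * np.array([np.cos(np.radians(30)), np.sin(np.radians(30))])
v = 25.0 * np.array([np.cos(np.radians(110)), np.sin(np.radians(110))])
n_i, n_j = 40, 30

# Synthetic points: drawn in grid coordinates, then mapped to world coordinates.
rng = np.random.default_rng(0)
ij_true = rng.uniform([0, 0], [n_i, n_j], size=(100_000, 2))
xy = origin + ij_true @ np.column_stack([u, v]).T

# Back to grid coordinates, where the bins are unit squares.
ij = to_grid_coords(xy, origin, u, v)
counts, i_edges, j_edges = np.histogram2d(
    ij[:, 0], ij[:, 1], bins=[np.arange(n_i + 1), np.arange(n_j + 1)]
)
```

`counts[i, j]` then holds the number of points in the cell spanned by `u` and `v` at grid position (i, j).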

    However, the "back" part can be handled by Matplotlib directly with matplotlib.transforms.
    pcolormesh, hist2d and imshow all accept a transform keyword, which can be used to plot the Cartesian data in the desired coordinates, like so:

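    import matplotlib.pyplot as plt
    from matplotlib.transforms import Affine2D

    # I, J: point coordinates in the Cartesian (grid-aligned) system
    # bins, cmap, figsize: set as needed
    # a, b, c, d, e, f: values of the affine transformation matrix
    transform = Affine2D.from_values(a, b, c, d, e, f)
    fig, ax = plt.subplots(figsize=figsize)
    # hist2d returns (counts, xedges, yedges, image); only the image is needed here
    _, _, _, im = ax.hist2d(I, J, bins=bins, cmap=cmap, transform=transform + ax.transData)
    fig.colorbar(im)
    ax.autoscale()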
    

    It is not really much faster than handling the "back" conversion with NumPy, but it makes the code lighter, as it requires only one additional line and one additional keyword.
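For comparison, the NumPy version of the "back" conversion could look roughly like this. It is a minimal sketch, assuming the same kind of grid description as before (an `origin` and two edge vectors `u` and `v`, all hypothetical values here); `cell_corners` is a helper name invented for this example. It maps the bin edges from grid coordinates to the world coordinates of every cell corner.

```python
import numpy as np

def cell_corners(i_edges, j_edges, origin, u, v):
    """World coordinates of all bin corners, each shaped (len(i_edges), len(j_edges))."""
    ii, jj = np.meshgrid(i_edges, j_edges, indexing="ij")
    x = origin[0] + ii * u[0] + jj * v[0]
    y = origin[1] + ii * u[1] + jj * v[1]
    return x, y

# Hypothetical non-orthogonal grid with 4 x 3 bins.
origin = np.array([0.0, 0.0])
u = np.array([2.0, 1.0])
v = np.array([0.5, 3.0])
i_edges = np.arange(5)  # 5 edges -> 4 bins along u
j_edges = np.arange(4)  # 4 edges -> 3 bins along v

X, Y = cell_corners(i_edges, j_edges, origin, u, v)
```

With `counts` of shape (4, 3) from `np.histogram2d`, the corner arrays `X` and `Y` of shape (5, 4) should be usable as `ax.pcolormesh(X, Y, counts)`, since pcolormesh accepts 2D corner coordinates that are one larger than the data in each dimension.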
    imshow can be a bit of a pain: it does not update the display extent after ax.autoscale(), and it handles coordinates in image (matrix) convention, so the transform has to be adjusted accordingly. For these reasons, I prefer hist2d.

    References:

    1. https://matplotlib.org/3.1.1/api/transformations.html#module-matplotlib.transforms
    2. https://matplotlib.org/3.1.1/tutorials/advanced/transforms_tutorial.html