
uproot: best way to load and replot a TH2 histogram from a .root file in a Jupyter notebook


I am very new to Python and uproot. Previously, I used ROOT in a C++ environment. Following the uproot tutorial, I can read my TH2D histograms from a .root file.

I now want to recreate and replot the existing histogram with matplotlib or seaborn, but I don't understand the structure of the imported TH2. myTH2D._members() correctly outputs:

['fName',
 'fTitle',
 'fLineColor',
 'fLineStyle',
 'fLineWidth',
 'fFillColor',
 'fFillStyle',
 'fMarkerColor',
 'fMarkerStyle',
 'fMarkerSize',
 'fNcells',
 'fXaxis',
 'fYaxis',
 'fZaxis',
 'fBarOffset',
 'fBarWidth',
 'fEntries',
 'fTsumw',
 'fTsumw2',
 'fTsumwx',
 'fTsumwx2',
 'fMaximum',
 'fMinimum',
 'fNormFactor',
 'fContour',
 'fSumw2',
 'fOption',
 'fFunctions',
 'fBufferSize',
 'fBuffer',
 'fBinStatErrOpt',
 'fScalefactor',
 'fTsumwy',
 'fTsumwy2',
 'fTsumwxy']

myTH2D.edges outputs the right axis edges and myTH2D.values outputs the right counts (confirmed with a rough plt.imshow(myTH2D.values)). The problems start when I call myTH2D.pandas():

                                                        count  variance
tof1 [ns]       tof2 [ns]
[-inf, 4500.0)  [-inf, 4500.0)                            0.0       0.0
                [4500.0, 4507.142857142857)               0.0       0.0
                [4507.142857142857, 4514.285714285715)    0.0       0.0
                [4514.285714285715, 4521.428571428572)    0.0       0.0
                [4521.428571428572, 4528.571428571428)    0.0       0.0
...             ...                                       ...       ...
[7500.0, inf)   [6971.428571428572, 6978.571428571429)    0.0       0.0
                [6978.571428571429, 6985.714285714286)    0.0       0.0
                [6985.714285714286, 6992.857142857143)    0.0       0.0
                [6992.857142857143, 7000.0)               0.0       0.0
                [7000.0, inf)                             0.0       0.0

123904 rows × 2 columns

and the tuple returned by myTH2D.numpy() is nested in a way that I don't understand:

(array([[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]]),
 [(array([4500.        , 4508.57142857, 4517.14285714, 4525.71428571,
          4534.28571429, 4542.85714286, 4551.42857143, 4560.        ,
          ...,
          7414.28571429, 7422.85714286, 7431.42857143, 7440.        ,
          7448.57142857, 7457.14285714, 7465.71428571, 7474.28571429,
          7482.85714286, 7491.42857143, 7500.        ]),
   array([4500.        , 4507.14285714, 4514.28571429, 4521.42857143,
          4528.57142857, 4535.71428571, 4542.85714286, 4550.        ,
          ...,
          6957.14285714, 6964.28571429, 6971.42857143, 6978.57142857,
          6985.71428571, 6992.85714286, 7000.        ]))])

Do you have any suggestions on how to handle this tuple?

Thank you!

EDIT:

With the following syntax, I can almost achieve the right plot, but it is flipped compared to the original:

plt.pcolormesh(myTH2D[1][0][0],myTH2D[1][0][1],myTH2D[0])

Left: original ROOT TH2D; right: plt.pcolormesh from the NumPy output

Nevertheless, my problem remains: I'd like to process the data through pandas and keep the axis labels, but right now I don't know which is the x-axis and which is the y-axis. Any ideas?


Solution

  • From the arrays of edges and bin counts (myTH2D.numpy()), you could use any of these techniques to plot it in Matplotlib:

    Python: Creating a 2D histogram from a numpy matrix

    You mentioned Seaborn, but I'm less familiar with that. Surely it has similar functions.
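
As a minimal sketch of the Matplotlib route: the tuple shown in the question has the form (counts, [(xedges, yedges)]). The example below builds a small synthetic stand-in with that shape (the bin counts and the 6 x 4 binning are made up) and assumes the counts are indexed [x bin, y bin], as with numpy.histogram2d. If that assumption holds, it would explain the flipped plot, because pcolormesh expects rows to run along y, so the counts need a transpose.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook; drop this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for myTH2D.numpy(): (counts, [(xedges, yedges)]).
# Edge ranges mimic the question's axes; the binning and counts are invented.
xedges = np.linspace(4500.0, 7500.0, 7)   # 6 bins along x
yedges = np.linspace(4500.0, 7000.0, 5)   # 4 bins along y
rng = np.random.default_rng(0)
counts = rng.poisson(3.0, size=(6, 4)).astype(float)  # assumed layout: [x bin, y bin]
th2_numpy = (counts, [(xedges, yedges)])

# Unpack the nested structure in one step.
values, [(xe, ye)] = th2_numpy

fig, ax = plt.subplots()
# pcolormesh wants C[row = y bin, column = x bin], hence the transpose.
mesh = ax.pcolormesh(xe, ye, values.T)
fig.colorbar(mesh, ax=ax, label="count")
ax.set_xlabel("tof1 [ns]")  # axis titles taken from the question's pandas output
ax.set_ylabel("tof2 [ns]")
```

With a square histogram like the one in the question, forgetting the transpose raises no error and simply swaps the axes; with unequal bin counts, as in this sketch, Matplotlib would reject the untransposed array.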

    On the bleeding edge, you could instead install uproot4 and hist>=2.0.0 (to get the hist prerelease), and then just call

    myTH2D.to_hist().plot()
    

    The hist library aims to be a one-stop shop for histogramming, and it's close to its first non-prerelease version. (The series starts at 2.0.0 because it took over the name of a no-longer-updated project. "hist" is too general a name to lose!)

    The Uproot 4 codebase is almost ready to replace the current Uproot; it needs documentation and file-writing capabilities. The interface is slightly different to address issues with Uproot 3's interface (e.g. strings vs bytestrings), so that's why this is being handled as a gradual transition with a temporarily different library name, rather than changing all at once. But if you're just starting out, you might want to start with the new library, so that you don't have to get used to a change in the near future (this fall).
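
On the pandas part of the question: in the .pandas() output shown above, the tof1 [ns] intervals run up to 7500 and the tof2 [ns] intervals up to 7000, which matches the first and second edge arrays from .numpy(), so tof1 appears to be the x-axis and tof2 the y-axis. If you prefer to build a labeled table yourself from the edges and counts, a rough sketch could look like this. The data are synthetic, the counts are again assumed to be indexed [x bin, y bin], and the underflow and overflow bins are left out.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the edges and counts (invented 3 x 2 binning).
xedges = np.linspace(4500.0, 7500.0, 4)            # 3 bins along x (tof1)
yedges = np.linspace(4500.0, 7000.0, 3)            # 2 bins along y (tof2)
values = np.arange(6, dtype=float).reshape(3, 2)   # assumed layout: [x bin, y bin]

# Half-open intervals [low, high), like the ones .pandas() prints.
x_bins = pd.IntervalIndex.from_breaks(xedges, closed="left", name="tof1 [ns]")
y_bins = pd.IntervalIndex.from_breaks(yedges, closed="left", name="tof2 [ns]")

# Wide form: one row per x bin, one column per y bin.
wide = pd.DataFrame(values, index=x_bins, columns=y_bins)

# Long form: one row per (x bin, y bin) pair, x varying slowest,
# which matches the C-order flattening of a [x, y] array.
index = pd.MultiIndex.from_product([x_bins, y_bins], names=["tof1 [ns]", "tof2 [ns]"])
long_form = pd.DataFrame({"count": values.ravel()}, index=index)

print(wide)
print(long_form)
```

The wide form keeps the axis names on the row and column indexes, so it stays clear which axis is which when the table is passed on to a plotting function.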