I want to sum the 4-momenta of all the constituents in a jet. In uproot3 (with uproot3-methods), I could create a TLorentzVectorArray and just call .sum() on it.
So this worked fine:
import uproot3
import uproot3_methods
import awkward0 as ak
input_file = uproot3.open(input_path)
tree = input_file['Jets']
pt = tree.array('Constituent_pt')
phi = tree.array('Constituent_phi')
eta = tree.array('Constituent_eta')
energy = tree.array('Constituent_energy')
mass = tree.array('Constituent_mass')
p4 = uproot3_methods.TLorentzVectorArray.from_ptetaphie(pt, eta, phi, energy)
jet_p4_u3 = p4.sum()
jet_pt_u3 = jet_p4_u3.pt
jet_eta_u3 = jet_p4_u3.eta
jet_phi_u3 = jet_p4_u3.phi
jet_energy_u3 = jet_p4_u3.energy
However, since uproot3 is deprecated, the recommended replacement according to "TLorentz vector in Uproot 4" seems to be the vector package. This is what I tried:
import uproot
import awkward
import vector
input_file = uproot.open(input_path)
tree = input_file['Jets']
pt = tree.arrays()['Constituent_pt']
phi = tree.arrays()['Constituent_phi']
eta = tree.arrays()['Constituent_eta']
energy = tree.arrays()['Constituent_energy']
mass = tree.arrays()['Constituent_mass']
p4 = vector.awk({"pt": pt, "phi": phi, "eta": eta, "energy": energy})
The problem is that p4.sum() does not seem to exist there. The other possibility I found is shown in vector discussion #117: after the imports I add vector.register_awkward(), and at the end I add jet_p4_u4 = ak.Array(p4, with_name="Momentum4D"):
import uproot
import awkward as ak
import vector
vector.register_awkward()
input_file = uproot.open(input_path)
tree = input_file['Jets']
pt = tree.arrays()['Constituent_pt']
phi = tree.arrays()['Constituent_phi']
eta = tree.arrays()['Constituent_eta']
energy = tree.arrays()['Constituent_energy']
mass = tree.arrays()['Constituent_mass']
p4 = ak.Array({"pt": pt, "phi": phi, "eta": eta, "energy": energy})
jet_p4_u4 = ak.Array(p4, with_name="Momentum4D")
The question remains: how do I sum the 4-momenta? When I do ak.sum(jet_p4_u4, axis=-1), only pt and energy seem to have the correct values; eta and phi are completely different from the uproot3 result.
Update: Since ak.sum cannot add the angles in the intended way, summing x, y, z and energy instead and constructing the vector from those sums solves the problem. However, I believe there must be a better way than this. The current working version:
import uproot
import awkward as ak
import vector
input_file = uproot.open(input_path)
tree = input_file['Jets']
pt = tree.arrays()['Constituent_pt']
phi = tree.arrays()['Constituent_phi']
eta = tree.arrays()['Constituent_eta']
energy = tree.arrays()['Constituent_energy']
mass = tree.arrays()['Constituent_mass']
p4 = vector.awk({"pt": pt, "phi": phi, "eta": eta, "energy": energy})
p4_lz = vector.awk({"x": p4.x, "y": p4.y, "z": p4.z, "t": energy})
lz_sum = ak.sum(p4_lz, axis=-1)
jet_p4 = vector.awk({
    "x": lz_sum.x,
    "y": lz_sum.y,
    "z": lz_sum.z,
    "t": lz_sum.t
})
jet_energy = jet_p4.t
jet_mass = jet_p4.tau
jet_phi = jet_p4.phi
jet_pt = jet_p4.rho
For a solution that works equally well for flat arrays of Lorentz vectors as for jagged arrays of Lorentz vectors, try this:
import uproot
import awkward as ak
import vector
vector.register_awkward() # any record named "Momentum4D" will be Lorentz
with uproot.open(input_path) as input_file:
    tree = input_file["Jets"]
    arrays = tree.arrays(filter_name="Constituent_*")

p4 = ak.zip({
    "pt": arrays.Constituent_pt,
    "phi": arrays.Constituent_phi,
    "eta": arrays.Constituent_eta,
    "energy": arrays.Constituent_energy,
}, with_name="Momentum4D")

jet_p4 = ak.zip({
    "px": ak.sum(p4.px, axis=-1),
    "py": ak.sum(p4.py, axis=-1),
    "pz": ak.sum(p4.pz, axis=-1),
    "energy": ak.sum(p4.energy, axis=-1)
}, with_name="Momentum4D")
Note that the uproot.TTree.arrays function, if given no arguments, will read all TBranches in the TTree. In your code, tree.arrays() is called once per variable, so you read all the data again each time, select a single column from what had been read, and throw the rest out.
Also, I don't like the vector.awk function because it can construct arrays of type:
N * Momentum4D[px: var * float64, py: var * float64, pz: var * float64, E: var * float64]
(in other words, each "px" value is a list of floats), rather than what you want:
N * var * Momentum4D[px: float64, py: float64, pz: float64, E: float64]
ak.zip combines the lists so that the "px" of each Lorentz vector is just a number, but you can have nested lists of Lorentz vectors. This only makes a difference if you have jagged arrays, but I'm pointing it out so that no one falls into this trap.
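To make the difference between the two types concrete, here is a plain-Python analogy. Nested lists and dicts stand in for Awkward Arrays, no Awkward function is called, and the sample values and the names record_of_lists and lists_of_records are made up for illustration:

```python
# Two events: the first has two constituents, the second has one.
px = [[1.0, 2.0], [3.0]]
py = [[0.5, 0.5], [1.5]]

# Layout 1: one record per event, and each field holds a list.
# Analogous to  N * Momentum4D[px: var * float64, ...]
record_of_lists = [{"px": x, "py": y} for x, y in zip(px, py)]

# Layout 2: each event holds a list of records, one per constituent.
# Analogous to  N * var * Momentum4D[px: float64, ...]
lists_of_records = [
    [{"px": a, "py": b} for a, b in zip(x, y)]
    for x, y in zip(px, py)
]

print(record_of_lists[0])   # a single record whose "px" is a list
print(lists_of_records[0])  # a list of records whose "px" is a number
```

Only in the second layout is each element a single vector with scalar components, which is what per-vector methods expect.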
The with_name="Momentum4D" argument labels the records with that name, and having Lorentz-vector behaviors registered with vector.register_awkward() gives all such records Lorentz vector methods. In this case, we're using it so that p4, defined in terms of pt, phi, eta, energy, has properties px, py, pz; in other words, it does coordinate transformations on demand.
There isn't a Lorentz vector summation method that sums over each item in a jagged array (the uproot-methods one was a hack that only works for jagged arrays of Lorentz vectors, no other structures, like jagged-jagged, etc.), so sum the components with ak.sum in Cartesian coordinates.
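The arithmetic behind that advice can be sketched in plain Python for a single jet. This is only an illustration of why the sum has to be done in Cartesian coordinates, not how vector or Awkward Array implement it. The helper names (to_cartesian, from_cartesian, sum_p4) and the sample constituents are hypothetical, and the sketch assumes the summed pt is non-zero:

```python
import math

def to_cartesian(pt, eta, phi, energy):
    """Convert (pt, eta, phi, E) to (px, py, pz, E)."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta), energy)

def from_cartesian(px, py, pz, energy):
    """Convert (px, py, pz, E) back to (pt, eta, phi, E); assumes pt > 0."""
    pt = math.hypot(px, py)
    phi = math.atan2(py, px)
    eta = math.asinh(pz / pt)
    return (pt, eta, phi, energy)

def sum_p4(constituents):
    """Sum four-momenta given as (pt, eta, phi, E) tuples, component-wise in Cartesian."""
    px = py = pz = energy = 0.0
    for constituent in constituents:
        cx, cy, cz, ce = to_cartesian(*constituent)
        px += cx
        py += cy
        pz += cz
        energy += ce
    return from_cartesian(px, py, pz, energy)

# Two made-up massless constituents at eta = 0, separated by 90 degrees in phi.
constituents = [(3.0, 0.0, 0.0, 3.0), (4.0, 0.0, math.pi / 2, 4.0)]

jet_pt, jet_eta, jet_phi, jet_energy = sum_p4(constituents)

# Adding pt and phi directly gives different numbers.
naive_pt = sum(c[0] for c in constituents)
naive_phi = sum(c[2] for c in constituents)

print(jet_pt, jet_phi)      # Cartesian sum
print(naive_pt, naive_phi)  # direct sum of pt and phi
```

Energy is additive in either representation, whereas pt, eta and phi are not additive in general, so the two approaches agree on energy and can disagree on everything else.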