Does the awkward library provide a way to slice out all attributes of a given name, regardless of the level? I was thinking something like this:
import awkward as ak

obj = {
    'resource_id': 'abc',
    'events': [
        {'resource_id': '123', 'value': 12, 'picks': [
            {'resource_id': 'asd', 'value': 1},
            {'resource_id': 'dll', 'value': 12},
        ]},
        {'resource_id': '456', 'value': 12, 'picks': [
            {'resource_id': 'cvf', 'value': 23},
            {'resource_id': 'ggf', 'value': 34},
        ]},
    ],
}
ar = ak.from_iter(obj)
rid = ar[..., 'resource_id']
The value of rid is simply the string 'abc', but I was expecting something more like the following:
[
['abc'],
['events':[
[['123'], 'picks':[['asd'], ['dll']]],
[['456'], 'picks':[['cvf'], ['ggf']]],
]
]
However, I am still trying to get my head around awkward arrays so I could be completely off here.
It doesn't, and I'm not sure how the output of such an operation should be shaped. For instance, if you pick the outer "resource_id", you get
>>> ar["events", "resource_id"]
<Array ['123', '456'] type='2 * string'>
but if you pick the inner "resource_id", you get
>>> ar["events", "picks", "resource_id"]
<Array [['asd', 'dll'], ['cvf', 'ggf']] type='2 * var * string'>
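To make the shape problem concrete, here is a minimal plain-Python sketch (not an Awkward Array feature) that walks the original nested object and reports where each "resource_id" sits. The helper name find_field is hypothetical, introduced only for illustration. Counting the list indices in each path shows that the values live at three different nesting depths, which is why a single rectangular or jagged output is hard to define.

```python
# Plain-Python sketch, independent of awkward: locate every "resource_id".
obj = {
    'resource_id': 'abc',
    'events': [
        {'resource_id': '123', 'value': 12, 'picks': [
            {'resource_id': 'asd', 'value': 1},
            {'resource_id': 'dll', 'value': 12},
        ]},
        {'resource_id': '456', 'value': 12, 'picks': [
            {'resource_id': 'cvf', 'value': 23},
            {'resource_id': 'ggf', 'value': 34},
        ]},
    ],
}


def find_field(node, name, path=()):
    """Yield (path, value) for every dict key equal to `name`, at any depth.

    `path` is a tuple of dict keys and list indices leading to the value.
    (Hypothetical helper, not part of awkward.)
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key == name:
                yield path + (key,), value
            else:
                yield from find_field(value, name, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_field(item, name, path + (index,))


for path, value in find_field(obj, 'resource_id'):
    # The number of integer steps is the list-nesting depth of this value.
    depth = sum(isinstance(step, int) for step in path)
    print(depth, path, value)
```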
Note that the ... does have a meaning, but it slices through rows (nested lists), not columns (record field names).
>>> ar["events", "picks", "value"]
<Array [[1, 12], [23, 34]] type='2 * var * int64'>
>>> ar["events", "picks", "value", ..., 0]
<Array [1, 23] type='2 * int64'>
Also, it might help to know that you can project with strings and lists of strings (nested projection):
>>> print(ar["events", "picks", ["resource_id", "value"]])
[[{resource_id: 'asd', value: 1}, ... {resource_id: 'ggf', value: 34}]]
That may be useful for your slicing problem, which will likely come down to manually picking out "resource_id" at each level and putting the results together in a way that makes sense for your data, though that may not generalize.
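One possible way to do that manual picking is to prune the plain Python object before it ever becomes an array, keeping only the "resource_id" keys and the containers that lead to them. The sketch below is an assumption about what "makes sense for your data", not something awkward provides; keep_field is a hypothetical helper name, and it drops list items that contain no matching field, which may or may not be what you want.

```python
# Plain-Python sketch, independent of awkward: prune a nested object so that
# only one field name (and the structure leading to it) survives.
obj = {
    'resource_id': 'abc',
    'events': [
        {'resource_id': '123', 'value': 12, 'picks': [
            {'resource_id': 'asd', 'value': 1},
            {'resource_id': 'dll', 'value': 12},
        ]},
        {'resource_id': '456', 'value': 12, 'picks': [
            {'resource_id': 'cvf', 'value': 23},
            {'resource_id': 'ggf', 'value': 34},
        ]},
    ],
}


def keep_field(node, name):
    """Return a pruned copy of `node` that keeps only keys equal to `name`
    and the dicts/lists on the way to them; return None if nothing matches.
    (Hypothetical helper, not part of awkward.)
    """
    if isinstance(node, dict):
        kept = {}
        for key, value in node.items():
            if key == name:
                kept[key] = value
            else:
                sub = keep_field(value, name)
                if sub is not None:
                    kept[key] = sub
        return kept or None
    if isinstance(node, list):
        # Assumption: list items with no matching field are dropped entirely.
        items = [keep_field(item, name) for item in node]
        items = [item for item in items if item is not None]
        return items or None
    # Scalars that are not stored under `name` are discarded.
    return None


pruned = keep_field(obj, 'resource_id')
print(pruned)
```

The pruned object has the same nesting as the original, so it could then be passed to ak.from_iter and projected with the same field paths as before, for example ["events", "picks", "resource_id"].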