I have a large nested dictionary with a jagged list 'c':
x = {'first_block':
         {'unit1': {'a': (3,5,4), 'b': 23, 'c': [10]},
          'unit2': {'a': (5,8,7), 'b': 15, 'c': [20,10]},
          'unit10k': {'a': (2,4,9), 'b': 10, 'c': [6,10,20,5]}},
     'second_block':
         {'unit1': {'a': (8,20,14), 'b': 10, 'c': [17,12,9]},
          'unit2': {'a': (9,25,50), 'b': 15, 'c': [17,15,9,4,12]},
          'unit12k': {'a': (12,24,9), 'b': 23, 'c': [12,22,15,4]}},
     'millionth_block':
         {'unit1': {'a': (35,64,85), 'b': 64, 'c': [50]},
          'unit2': {'a': (56,23,34), 'b': 55, 'c': [89,59,77]},
          'unit5k': {'a': (90,28,12), 'b': 85, 'c': [48,90,27,59]}}}
The elements of 'c' are point labels.
For every unique point label in 'c', I want to produce a list of the corresponding values of 'b'.
For example, the unique elements of 'c' in 'first_block' are 5, 6, 10 and 20.
For each block, I want to list every value of 'b' associated with a specific value in 'c', e.g.:
first_block:
5: [10]
6: [10]
10: [10,15,23]
20: [10,15]
second_block:
4: [15,23]
9: [10,15]
12: [10,15,23]
15: [15,23]
17: [10,15]
22: [23]
etc.
Any thoughts on how to create this outcome, given that 'c' is jagged?
I have been trying to do this by converting to Awkward Arrays, but the documentation is currently sparse and I don't really understand how to do this in Awkward.
I am also open to Pythonic suggestions that don't involve Awkward.
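Try this. It reproduces the output you want, with the labels in each block sorted. Note that the 'b' values inside each list appear in the order the units are visited, not in sorted order.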
x = {'first_block':
         {'unit1': {'a': (3,5,4), 'b': 23, 'c': [10]},
          'unit2': {'a': (5,8,7), 'b': 15, 'c': [20,10]},
          'unit10k': {'a': (2,4,9), 'b': 10, 'c': [6,10,20,5]}},
     'second_block':
         {'unit1': {'a': (8,20,14), 'b': 10, 'c': [17,12,9]},
          'unit2': {'a': (9,25,50), 'b': 15, 'c': [17,15,9,4,12]},
          'unit12k': {'a': (12,24,9), 'b': 23, 'c': [12,22,15,4]}},
     'millionth_block':
         {'unit1': {'a': (35,64,85), 'b': 64, 'c': [50]},
          'unit2': {'a': (56,23,34), 'b': 55, 'c': [89,59,77]},
          'unit5k': {'a': (90,28,12), 'b': 85, 'c': [48,90,27,59]}}}
results = {}
for key in x.keys():  # block-level key
    results[key] = {}
    for unit in x[key].keys():  # unit-level key in the sub-dict
        for value in x[key][unit]['c']:  # each point label in 'c'
            if value not in results[key].keys():
                # First time this label is seen in the block: start an empty list
                results[key][value] = []
            # Append this unit's 'b' value to the label's list
            results[key][value].append(x[key][unit]['b'])
    # Sort the block's entries by label
    results[key] = dict(sorted(results[key].items()))

for key in results:
    print(key)
    for value in results[key].keys():
        print(value, results[key][value])
Output:
first_block
5 [10]
6 [10]
10 [23, 15, 10]
20 [15, 10]
second_block
4 [15, 23]
9 [10, 15]
12 [10, 15, 23]
15 [15, 23]
17 [10, 15]
22 [23]
millionth_block
27 [85]
48 [85]
50 [64]
59 [55, 85]
77 [55]
89 [55]
90 [85]
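If you also want the 'b' values inside each list sorted, as in the expected output in the question (e.g. 10: [10,15,23]), the same grouping can be sketched more compactly with collections.defaultdict. This is only a sketch of the idea: the function name b_values_by_label and the small sample dictionary are made up for illustration, and it assumes every unit has both a 'b' and a 'c' entry.

```python
from collections import defaultdict


def b_values_by_label(blocks):
    """Map each block name to {label from 'c': sorted list of 'b' values}."""
    results = {}
    for block_name, units in blocks.items():
        grouped = defaultdict(list)  # label -> list of 'b' values
        for unit in units.values():
            for label in unit['c']:  # 'c' may have any length (jagged)
                grouped[label].append(unit['b'])
        # Sort the labels, and sort the 'b' values within each label
        results[block_name] = {label: sorted(grouped[label])
                               for label in sorted(grouped)}
    return results


# Hypothetical sample: just the first block from the question
sample = {
    'first_block': {
        'unit1': {'a': (3, 5, 4), 'b': 23, 'c': [10]},
        'unit2': {'a': (5, 8, 7), 'b': 15, 'c': [20, 10]},
        'unit10k': {'a': (2, 4, 9), 'b': 10, 'c': [6, 10, 20, 5]},
    },
}

grouped_sample = b_values_by_label(sample)
for block_name, labels in grouped_sample.items():
    print(block_name)
    for label, b_values in labels.items():
        print(label, b_values)
```

As written, a label that occurs more than once in a single unit's 'c' contributes that unit's 'b' once per occurrence; iterate over set(unit['c']) instead if that is not what you want.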