I have a set of N chemical compounds enumerated 1, 2,..., N. For each compound, I have the fraction of each of its constituents, "A", "B", and so on. Compounds can also contain other compounds, in which case the corresponding fraction is given. For instance, for N = 5, a sample set is
mixes = {
    1: {
        "A": 0.32,
        "B": 0.12,
        "C": 0.15,
        2: 0.41
    },
    2: {
        "C": 0.23,
        "D": 0.12,
        "E": 0.51,
        4: 0.14
    },
    3: {
        "A": 0.24,
        "E": 0.76
    },
    4: {
        "B": 0.13,
        "F": 0.01,
        "H": 0.86
    },
    5: {
        "G": 0.1,
        2: 0.4,
        3: 0.5
    }
}
I would like an algorithm that gives the net fraction of each constituent in every compound, i.e.
mixes = {
    1: {
        "A": 0.32,
        "B": 0.12 + 0.41 * 0.14 * 0.13,
        "C": 0.15 + 0.41 * 0.23,
        "D": 0.41 * 0.12,
        "E": 0.41 * 0.51,
        "F": 0.41 * 0.14 * 0.01,
        "H": 0.41 * 0.14 * 0.86
    },
    2: {
        "B": 0.14 * 0.13,
        "C": 0.23,
        "D": 0.12,
        "E": 0.51,
        "F": 0.14 * 0.01,
        "H": 0.14 * 0.86
    },
    3: {
        "A": 0.24,
        "E": 0.76
    },
    4: {
        "B": 0.13,
        "F": 0.01,
        "H": 0.86
    },
    5: {
        "A": 0.5 * 0.24,
        "G": 0.1,
        "B": 0.4 * 0.14 * 0.13,
        "C": 0.4 * 0.23,
        "D": 0.4 * 0.12,
        "E": 0.4 * 0.51 + 0.5 * 0.76,
        "F": 0.4 * 0.14 * 0.01,
        "H": 0.4 * 0.14 * 0.86
    }
}
My current approach involves recursion, but I'd like to know if there's a clever way to do this. Perhaps a tree-like data structure would help?
EDIT: for simplicity, assume that there are no cyclic relationships in the dataset.
Here is a recursive solution. Since the original mix ratios for each compound add up to 1, it also checks that the resolved ratios for each compound add up to 1:
from pprint import pprint
from collections import defaultdict

mixes = {
    1: {
        'A': 0.32,
        'B': 0.12,
        'C': 0.15,
        2: 0.41
    },
    2: {
        'C': 0.23,
        'D': 0.12,
        'E': 0.51,
        4: 0.14
    },
    3: {
        'A': 0.24,
        'E': 0.76
    },
    4: {
        'B': 0.13,
        'F': 0.01,
        'H': 0.86
    },
    5: {
        'G': 0.1,
        2: 0.4,
        3: 0.5
    }
}

def resolve(compound):
    '''Return the components of a compound, recursively adjusted for ratios.
    '''
    for component, ratio in mixes[compound].items():
        # If the component is another compound, recursively report its ratios.
        if isinstance(component, int):
            for subcomponent, subratio in resolve(component):
                yield subcomponent, subratio * ratio
        else:
            yield component, ratio

# A dictionary of dictionaries with default 0.0 float values.
result = defaultdict(lambda: defaultdict(float))

# Resolve each compound and add up component ratios
for compound in mixes:
    for component, ratio in resolve(compound):
        result[compound][component] += ratio

pprint(result, width=1)

# Checking ratios of result compounds add up to 1
for compound, components in result.items():
    print(compound, sum(components.values()))
Output:
defaultdict(<function <lambda> at 0x00000214CD56FB00>,
            {1: defaultdict(<class 'float'>,
                            {'A': 0.32,
                             'B': 0.127462,
                             'C': 0.2443,
                             'D': 0.049199999999999994,
                             'E': 0.20909999999999998,
                             'F': 0.0005740000000000001,
                             'H': 0.049364}),
             2: defaultdict(<class 'float'>,
                            {'B': 0.0182,
                             'C': 0.23,
                             'D': 0.12,
                             'E': 0.51,
                             'F': 0.0014000000000000002,
                             'H': 0.12040000000000001}),
             3: defaultdict(<class 'float'>,
                            {'A': 0.24,
                             'E': 0.76}),
             4: defaultdict(<class 'float'>,
                            {'B': 0.13,
                             'F': 0.01,
                             'H': 0.86}),
             5: defaultdict(<class 'float'>,
                            {'A': 0.12,
                             'B': 0.007280000000000001,
                             'C': 0.09200000000000001,
                             'D': 0.048,
                             'E': 0.5840000000000001,
                             'F': 0.0005600000000000001,
                             'G': 0.1,
                             'H': 0.04816000000000001})})
1 1.0
2 1.0
3 1.0
4 1.0
5 1.0
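The recursion above re-resolves shared sub-compounds each time they appear (compound 2 is expanded for 1, 2 and 5). If that matters for larger datasets, one possible alternative is to treat the data as a dependency graph and process compounds in topological order, so each one is flattened exactly once. The following is only a sketch under some assumptions: it needs Python 3.9+ for the standard-library `graphlib` module, it keeps the convention that integer keys are compounds and string keys are constituents, it assumes every referenced compound has its own entry in `mixes`, and the names `flatten`, `deps` and `flat` are made up for this example. As a side effect, `static_order()` raises `graphlib.CycleError` if the no-cycles assumption does not hold.

```python
import graphlib
from math import isclose

mixes = {
    1: {'A': 0.32, 'B': 0.12, 'C': 0.15, 2: 0.41},
    2: {'C': 0.23, 'D': 0.12, 'E': 0.51, 4: 0.14},
    3: {'A': 0.24, 'E': 0.76},
    4: {'B': 0.13, 'F': 0.01, 'H': 0.86},
    5: {'G': 0.1, 2: 0.4, 3: 0.5},
}

def flatten(mixes):
    '''Return {compound: {constituent: net fraction}}, resolving each compound once.'''
    # Map every compound to the set of compounds it directly contains.
    deps = {
        compound: {part for part in parts if isinstance(part, int)}
        for compound, parts in mixes.items()
    }
    flat = {}
    # static_order() yields contained compounds before the compounds that use
    # them, and raises graphlib.CycleError if the data has a cycle.
    for compound in graphlib.TopologicalSorter(deps).static_order():
        totals = {}
        for part, ratio in mixes[compound].items():
            if isinstance(part, int):
                # Sub-compound: already flattened, so just scale its fractions.
                for constituent, fraction in flat[part].items():
                    totals[constituent] = totals.get(constituent, 0.0) + ratio * fraction
            else:
                totals[part] = totals.get(part, 0.0) + ratio
        flat[compound] = totals
    return flat

flat = flatten(mixes)

# Floating-point sums are compared with a tolerance rather than with ==.
for compound, components in sorted(flat.items()):
    print(compound, isclose(sum(components.values()), 1.0))
```

The result holds plain dictionaries rather than nested `defaultdict`s, and the sum check uses `math.isclose` because accumulated floating-point products are not guaranteed to add up to exactly 1.0 for other inputs.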