I have a simple adjacency list representation of a graph like this
{
1: [2, 3, 4],
2: [5],
3: [6, 9],
4: [3],
5: [3],
6: [7, 8],
7: [],
8: [],
9: []
}
which looks like this
But each node also has a "type ancestry" which tells us what "type" of node it is, and this is completely different from the hierarchy present in the adjacency list above. For example, it would tell us that "node 6 is of type 'BA', and all type 'BA' nodes are type 'B'".
For example, consider this ancestry:
{
1: [], # no ancestry
2: ['AA', 'A'], # Read this as "2 is a type AA node, and all type AA nodes are type A nodes"
3: ['B'], # 3 is directly under type B
4: [],
5: ['AA', 'A'],
6: ['BA', 'B'],
7: ['BA', 'B'],
8: ['BA', 'B'],
9: ['BB', 'B']
}
which when visualized would look like this
However, instead of connecting nodes directly as per the adjacency list, we must use their "representative types" when available, where the representative types of the nodes would be the nodes just below the lowest common ancestor of their type ancestries. When visualized with this adjustment, it would look like this
So what I want to produce programmatically is a "hierarchical/nested" adjacency list for such a visualization, which would look like below. The main idea is to introduce a subtree
for each key in the adjacency list (along with the edges
field), which would in turn contain its own adjacency list and so forth (recursively).
{1: {'edges': ['A', 'B', 4], 'subgraphs': {}},
4: {'edges': ['B'], 'subgraphs': {}},
'A': {'edges': ['B'],
'subgraphs': {'AA': {'edges': [],
'subgraphs': {2: {'edges': [5], 'subgraphs': {}},
5: {'edges': [], 'subgraphs': {}}}}}},
'B': {'edges': [],
'subgraphs': {3: {'edges': ['BA', 'BB'], 'subgraphs': {}},
'BA': {'edges': [],
'subgraphs': {6: {'edges': [7, 8], 'subgraphs': {}},
7: {'edges': [], 'subgraphs': {}},
8: {'edges': [], 'subgraphs': {}}}},
'BB': {'edges': [],
'subgraphs': {9: {'edges': [], 'subgraphs': {}}}}}}}
What is an elegant way of transforming the original adjacency list + the separate "ancestry" map to produce such a data structure?
Basing this off of @trincot's answer (who wanted me to write my own answer instead of editing theirs as the change is non-trivial) with a critical change on how edges are made.
Create a node (a dict) for each distinct node that can be found in ancestry list, so including the ancestry groups. A node looks like this:
{ 'edges': [], 'subgraphs': {} }
These objects are collected in a dict, keyed by their name (e.g. key could be 1 or 'AA', ...)
Populate the 'edges' attribute based on the adjacency input list. Make sure to use the ancestry reference instead when the edge transitions to another ancestry group. Here is where the LCA approach comes in.
Populate the 'subgraphs' attribute based on the ancestry input list. While doing that, keep track of the nodes that are not a child of another node: these should be retained for the root node in the data structure.
def transform(adjacency, ancestry):
# utility function to get the element in the list before the target, if one is present
def get_element_before(lst, target):
try:
index = lst.index(target)
return lst[index - 1] if index - 1 >= 0 else None
except ValueError:
return None
# Finds the lowest common ancestor between 2 paths, assuming that the paths
# are ordered from bottom to top
def find_lca(path1, path2):
lca = None
for a, b in zip(path1[::-1], path2[::-1]):
if a == b:
lca = a
else:
break
return lca
# Given two nodes, it adds a link between the correct pair of "representative" nodes
# in the newly constructed nested graph
def add_link(node1, node2, nodes):
ancestry1, ancestry2 = ancestry[node1], ancestry[node2]
# Special cases when the LCA cannot be found
if not ancestry1 and not ancestry2:
nodes[node1]['edges'].append(node2)
elif not ancestry1:
nodes[node1]['edges'].append(ancestry2[-1])
elif not ancestry2:
nodes[ancestry1[-1]]['edges'].append(node2)
else:
# When LCA is likely to be present
lca = find_lca(ancestry1, ancestry2)
if not lca:
# This can happen if the 2 nodes have completely disjoint hierarchy paths
nodes[ancestry1[-1]]['edges'].append(ancestry2[-1])
return
# The node just below the LCA in each node serves as the "representative" node of that node in the newly built graph
representative_node1 = get_element_before(ancestry1, lca)
representative_node2 = get_element_before(ancestry2, lca)
# If the two nodes are in the same subtree at the same level, they
# will act as their own representative nodes
representative_node1 = node1 if representative_node1 is None else representative_node1
representative_node2 = node2 if representative_node2 is None else representative_node2
nodes[representative_node1]['edges'].append(representative_node2)
# Create the basic object (dict) for each node:
nodes = {
subgraph: { 'edges': [], 'subgraphs': {} }
for node, subgraphs in ancestry.items()
for subgraph in (node, *subgraphs)
}
# populate the "edges" attributes between basic nodes (or their "representative" nodes)
for node, children in adjacency.items():
for child in children:
add_link(node, child, nodes)
# keep track of the nodes that are to stay at the root level
root = dict(nodes)
# populate the "subgraphs" attributes
for node, ancestors in ancestry.items():
for child, parent in zip((node, *ancestors), ancestors):
nodes[parent]['subgraphs'][child] = nodes[child]
root.pop(child, None)
return root
Testing it
adj_list = {
1: [2, 3, 4],
2: [5],
3: [6, 9],
4: [3],
5: [3],
6: [7, 8],
7: [],
8: [],
9: []
}
ancestry = {
1: [],
2: ['AA', 'A'],
3: ['B'],
4: [],
5: ['AA', 'A'],
6: ['BA', 'B'],
7: ['BA', 'B'],
8: ['BA', 'B'],
9: ['BB', 'B']
}
pprint.pprint(transform(adj_list, ancestry))
produces
{1: {'edges': ['A', 'B', 4], 'subgraphs': {}},
4: {'edges': ['B'], 'subgraphs': {}},
'A': {'edges': ['B'],
'subgraphs': {'AA': {'edges': [],
'subgraphs': {2: {'edges': [5], 'subgraphs': {}},
5: {'edges': [], 'subgraphs': {}}}}}},
'B': {'edges': [],
'subgraphs': {3: {'edges': ['BA', 'BB'], 'subgraphs': {}},
'BA': {'edges': [],
'subgraphs': {6: {'edges': [7, 8], 'subgraphs': {}},
7: {'edges': [], 'subgraphs': {}},
8: {'edges': [], 'subgraphs': {}}}},
'BB': {'edges': [],
'subgraphs': {9: {'edges': [], 'subgraphs': {}}}}}}}