I'm new to subgraph matching and I found that networkx
has a pretty convenient way to do that. However I'm not sure how to match subgraphs correctly on graphs with edge attributes.
For example, say I have the following graph that has edge attributes:
import networkx as nx
import networkx.algorithms.isomorphism as iso
G = nx.MultiDiGraph()
G.add_edge(1, 2, label="A")
G.add_edge(1, 2, label="D")
G.add_edge(2, 1, label="B")
G.add_edge(2, 3, label="C")
And I'm trying to find a subgraph with the following structure:
SG = nx.MultiDiGraph()
SG.add_edge(5, 6)
SG.add_edge(6, 7)
Note that this subgraph does not have edge attribute so I was trying to find the subgraphs with any edges that fits the node structure (in this case a simple path of length 2). I found these 2 SO questions (one, two) which directed me to try the following 2 approaches:
Approach 1
gm = iso.MultiDiGraphMatcher(G, SG, edge_match=iso.categorical_edge_match("label", None))
print(gm.subgraph_is_monomorphic())
for subgraph_x in gm.subgraph_monomorphisms_iter():
print(subgraph_x)
This prints:
True
{1: 5, 2: 6, 3: 7}
However, I assumed that this will print two different subgraphs since there are 2 different edges between the nodes 1 and 2. Meaning, the two subgraphs should be ((1,2) with edge label "A", (2,3) with edge label "C"), and ((1,2) with edge label "D", (2,3) with edge label "C")
. I also couldn't find an approach to output which edge labels correspond to the output nodes which would have helped understand which path this ouputs.
Approach 2
SG = nx.MultiDiGraph()
SG.add_edge(5, 6, label="A")
SG.add_edge(6, 7, label="C")
gm = iso.MultiDiGraphMatcher(G, SG, edge_match=lambda x, y: x[0]['label'] == y[0]['label'])
print(gm.subgraph_is_monomorphic())
for subgraph_x in gm.subgraph_monomorphisms_iter():
print(subgraph_x)
This prints
True
{1: 5, 2: 6, 3: 7}
Here I specified which edges I want in the subgraph and the output makes sense, since there is only 1 subgraph with that structure. For example, if I change SG.add_edge(6, 7, label="C")
to SG.add_edge(6, 7, label="D")
, this prints False
.
However, for my use case I need the first approach to work, since my input is a subgraph without edge attributes.
I also tried using a graph without edge attributes and changing between subgraph_isomorphisms_iter
and subgraph_monomorphisms_iter
:
G = nx.MultiDiGraph([(1,2), (1,3), (1,4), (1,4), (4,5), (5,6), (1,7), (7,8)])
subgraph = nx.MultiDiGraph([(10,20), (10,30), (10,40), (40,50)])
subgraphs = []
GM = iso.MultiDiGraphMatcher(G,subgraph,edge_match=iso.categorical_edge_match([],[]))
for subgraph_x in GM.subgraph_monomorphisms_iter():
subgraphs.append(subgraph_x)
subgraphs
using subgraph_monomorphisms_iter
this prints:
[{1: 10, 2: 20, 3: 30, 4: 40, 5: 50},
{1: 10, 2: 20, 3: 30, 7: 40, 8: 50},
{1: 10, 2: 20, 4: 30, 7: 40, 8: 50},
{1: 10, 2: 20, 7: 30, 4: 40, 5: 50},
{1: 10, 3: 20, 2: 30, 4: 40, 5: 50},
{1: 10, 3: 20, 2: 30, 7: 40, 8: 50},
{1: 10, 3: 20, 4: 30, 7: 40, 8: 50},
{1: 10, 3: 20, 7: 30, 4: 40, 5: 50},
{1: 10, 4: 20, 2: 30, 7: 40, 8: 50},
{1: 10, 4: 20, 3: 30, 7: 40, 8: 50},
{1: 10, 7: 20, 2: 30, 4: 40, 5: 50},
{1: 10, 7: 20, 3: 30, 4: 40, 5: 50}]
Which seems to be very redundant as there should only be 3 subgraphs, but even using subgraph_isomorphisms_iter
it seems to find 2 which are redundant:
subgraphs = []
GM = nx.algorithms.isomorphism.DiGraphMatcher(G,subgraph) #dictionary that maps nodes of main graph to nodes of subgraph.
for subgraph_x in GM.subgraph_isomorphisms_iter():
subgraphs.append(subgraph_x)
This prints
[{1: 10, 2: 20, 3: 30, 7: 40, 8: 50}, {1: 10, 3: 20, 2: 30, 7: 40, 8: 50}]
However, when I tried switching the first 2 approaches to subgraph_isomorphisms_iter
it didn't find any subgraphs. So i'm very confused.
Update I found that this SO question explains the difference between isomorphism and monomorphism in networkx well. And found the source code for the matcher if that helps. It also seems from here that the redundancy is a result of symmetries and if I understand correctly networkx doesn't take symmetries into account.
My approach:
Expand MultiDiGraph to DiGraph, perform isomorphism check on expanded DiGraph, this way you account for multiple edges as they are counted seperately:
import networkx as nx
import networkx.algorithms.isomorphism as iso
G = nx.MultiDiGraph()
G.add_edge(1, 2, label="A")
G.add_edge(1, 2, label="D")
G.add_edge(2, 1, label="B")
G.add_edge(2, 3, label="C")
SG = nx.MultiDiGraph()
SG.add_edge(5, 6)
SG.add_edge(6, 7)
def expand_graph(G):
'''
Expand the graph such that for each edge e from u to v:
e=(u,v) you create an edge specific vertex: e -> (u,e_)(e_,u)
This way your MultiDiGraph becomes a DiGraph
'''
H = nx.DiGraph()
for u in G.nodes():
key = 0
for e in G.out_edges(u, data=True):
H.add_edge(u, str(u)+'_out_'+str(key))
H.add_edge(str(u)+'_out_'+str(key), e[1])
key+=1
nx.set_node_attributes(H, {n:0 for n in H.nodes()}, 'original')
nx.set_node_attributes(H, {n:1 for n in G.nodes()}, 'original')
return H
def collapse_graph(G_):
'''
Turns a DiGraph generated with the expand_graph function back
to a MultiDiGraph, adds edge attribtute label to distinguish edges
'''
H = nx.MultiDiGraph()
for u, d in G_.nodes(data=True):
if d['original']==1:
for u, e_ in G_.out_edges(u):
for e_, v in G_.out_edges(e_):
H.add_edge(u, v, label = e_.split('_')[-1])
return H
G_ = expand_graph(G)
SG_ = expand_graph(SG)
subgraphs = []
GM = nx.algorithms.isomorphism.DiGraphMatcher(G_,SG_)
#Perform the check solely for isomorph subgraphs,
#note this includes also isomorph subgraphs on the set of vertices e_
for subgraph_x in GM.subgraph_isomorphisms_iter():
subgraphs.append(subgraph_x)
result=[]
subgraphs_valid = []
#This gets rid of subgraphs that are isomorph but not on the right set of nodes,
#i.e. we want to check whether the nodes are of mixed type (e_ mixed with u)
for s in subgraphs:
if all([True if type(u)== type(v) else False for u,v in s.items()]):
subgraphs_valid.append(s)
#Iterate over the isomorph subgraphs, extract the subgraphs and parse them as results
#Collapse them in order to get a edge representations with labels for the edges
for s in subgraphs_valid:
f = [k for k, v in s.items()]
F_ = G_.subgraph(f)
F = collapse_graph(F_)
result.append(list(F.edges(data=True)))
print(result)
I am sure there is an easier solution... But maybe the approach helps you.