Taking the following graph:
G = nx.DiGraph()
G.add_edge(1,2)
G.add_edge(3,2)
G.add_edge(1,4)
G.add_edge(2,5)
which visualized gives:
I'd like to find the leaf-nodes which only have incoming-edges from within the "sub-graph" starting from 1
.
In my example it has to find 4
but not 5
. 5
is a child of 2
which has 3
as second input.
I think it should be something with successors and the in_degree but I'm such a newbie to NetworkX that it is quite hard to find the right algorithm.
Another example:
G = nx.DiGraph()
G.add_edge(1,2)
G.add_edge(3,2)
G.add_edge(1,4)
G.add_edge(2,5)
G.add_edge(4,6)
G.add_edge(1,7)
G.add_edge(7,6)
G.add_edge(1,8)
G.add_edge(8,7)
G.add_edge(8,6)
G.add_edge(4,9)
G.add_edge(4,10)
G.add_edge(5,10)
Here it should find 9
and 6
. But not 10
(because of 3
being parent of 2
being a parent 5
.
Let's try this bit of logic:
def find_leafnodes(G):
leafnode = []
for i in G.nodes:
head = []
if nx.descendants(G, i) == set(): #find all leaf nodes
for a in nx.ancestors(G, i): #get all ancestors for leaf node
if nx.ancestors(G, a) == set(): #Determine if ancestor is a head node
head.append(a)
if len(head) == 1: #if this leaf had only one head then append to leafnode
leafnode.append(i)
return leafnode
Input:
G = nx.DiGraph()
G.add_edge(1,2)
G.add_edge(3,2)
G.add_edge(1,4)
G.add_edge(2,5)
find_leafnodes(G)
# [4]
G = nx.DiGraph()
G.add_edge(1,2)
G.add_edge(3,2)
G.add_edge(1,4)
G.add_edge(2,5)
G.add_edge(4,6)
G.add_edge(1,7)
G.add_edge(7,6)
G.add_edge(1,8)
G.add_edge(8,7)
G.add_edge(8,6)
G.add_edge(4,9)
G.add_edge(4,10)
G.add_edge(5,10)
find_leafnodes(G)
# [6, 9]