This question is related to but not the same as the one recently asked here.
I just read the Wikipedia psuedocode.
algorithm tarjan is
input: graph G = (V, E)
output: set of strongly connected components (sets of vertices)
index := 0
S := empty
for each v in V do
if (v.index is undefined) then
strongconnect(v)
end if
end for
function strongconnect(v)
// Set the depth index for v to the smallest unused index
v.index := index
v.lowlink := index
index := index + 1
S.push(v)
// Consider successors of v
for each (v, w) in E do
if (w.index is undefined) then
// Successor w has not yet been visited; recurse on it
strongconnect(w)
v.lowlink := min(v.lowlink, w.lowlink)
else if (w is in S) then
// Successor w is in stack S and hence in the current SCC
v.lowlink := min(v.lowlink, w.index)
end if
end for
// If v is a root node, pop the stack and generate an SCC
if (v.lowlink = v.index) then
start a new strongly connected component
repeat
w := S.pop()
add w to current strongly connected component
until (w = v)
output the current strongly connected component
end if
end function
I obviously mustn't have understood it properly since I have two very basic questions:
When we say if (w is in S)
, then isn't this operation of O(N) or atleast O(logN) complexity because the elements shall be ordered by their indices? We will have to perform this for each new node accessible from the root node, so isn't the overall complexity O(NlogN)
. Moreover S is a stack so conceptually only the top element should be accessible, how do we implement search in it? Shouldn't a binary search tree be a better data structure here?
In this part:
else if (w is in S) then
v.lowlink := min(v.lowlink, w.index)
Is there a specific reason for using w.index
and not w.lowlink
? The advantage for using w.lowlink
would be that it would answer the previous question (the linked question). The LLs
for all the nodes in an SCC would guaranteed be the same for all nodes.
1) Your first question: It can easily been done in O(1) , just maintain a boolean array inStack
, the moment node n
is put in the stack, flag inStack[n]
to true. When you pop it off the stack, flag it back to false.
2) Not much different between w.index
and w.lowlink
, but this is easier to read, as we will understand that this condition is to check for the case from Node A ->B ->C ->A, which checking when node C can reach a predecessor node A or not. Keep in mind that at the moment we updating C, node A lowlink is not updated properly yet.
The Tarjan algorithm is based on the fact that a node will be the root of a SCC if and only if from that node, we cannot reach any predecessor node (which means it has the lowest lowlink in its SCC and also equals to this node's index). So the condition is just implemented this idea in the most straight forward way, if we encounter a node that is already visited, we check whether this node is a predecessor of the current node or not (which is determined by its index , also the level of this node in the graph)