algorithmdata-structuresgraphcomputer-sciencekosaraju-sharir

Order of steps in Kosaraju's algorithm


The Wikipedia summary of the Kosaraju-Sharir algorithm is as follows:

Let G be a directed graph and S be an empty stack.

  • While S does not contain all vertices.
    • Choose an arbitrary vertex v not in S.
    • Perform a depth-first search starting at v. Each time that depth-first search finishes expanding a vertex u, push u onto S.
  • Reverse the directions of all arcs to obtain the transpose graph.
  • While S is nonempty:
    • Pop the top vertex v from S.
    • Perform a depth-first search starting at v in the transpose graph.
    • The set of visited vertices will give the strongly connected component containing v; record this and remove all these vertices from the graph G and the stack S. Equivalently, breadth-first search (BFS) can be used instead of depth-first search.

But in my textbook - Sedgewick's Algorithms (fourth edition) - it describes the steps of the algorithm as follows:

  • Given a digraph G, compute the reverse post-order of its reverse digraph. GR
  • Run a standard DFS on G, but consider the unmarked vertices in the order just computed instead of the standard numerial order
  • The set of all vertices...

The conclusion drawn in the final step is identical, as are the operations performed in those preceding it - but it seems that those operations are given in different orders: Wikipedia tells me to start by doing a DFS on G and then transposing it, doing the second DFS on GR, whereas my textbook suggests that I begin with the transpose, do the first DFS on GR and the second on G.

My primary question is: Am I understanding this correctly, or am I misinterpreting what one or the other is saying?

Secondly: Intuitively, it seems as though these operations are transitive and therefore that these two "different methods" are in fact equivalent, and will always yield the same final result. I've tested this intuition on a couple of digraphs and it seems to hold true - but is it?

Thirdly: Assuming it is, is there any objective reason to prefer one over the other, or is it simply a matter of preference?


Solution

  • IMO the two algorithms are equivalent but there is a slight difference. The difference is in the order of SCCs (strongly connected components) output from them.

    Say we have an acyclic di-graph having SCCs as S1, S2, S3, S4 in that order.

    S1 -> S2 -> S3 -> S4
    

    Algorithm 1 (Wikipedia).

    When you build the stack S, for any vertex v, all the vertices coming after v shall enter the stack S before v, because we are carrying out DFS on the forward graph.

    Now we reverse the graph:

    R_S1 <- R_S2 <- R_S3 <- R_S4
    

    The first vertex to be popped from Stack S shall be in R_S1. Hence when you perform the DFS, the first SCC to be output shall be R_S1.

    Algorithm 2 (book).

    Here we first reverse the graph:

    R_S1 <- R_S2 <- R_S3 <- R_S4
    

    Now when we conduct the DFS on any vertex v, the vertices coming before v (in original graph) shall be ordered before v. Also, since we start the order_index from N and then decrement the order_index, all the vertices coming after v shall have a lower topological ordering than v.

    Original graph:

    S1 -> S2 -> S3 -> S4
    

    The lowest ordered vertex shall now be in S4. Hence the first SCC that will be output from the graph shall be S4 as opposed to R_S1 in the first case.