rubytarjans-algorithm

Bug in my Ruby version of Tarjan's algorithm


http://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm

http://en.algoritmy.net/article/44220/Tarjans-algorithm

I can't figure out this bug in my Ruby version of Tarjan's algorithm for strongly connected components. I got the Kosaraju–Sharir algorithm to work, and my Tarjan's algorithm in Ruby works for some graphs. But it does not connect 2 components that should be connected---"10" and "11, 12, 9"

Input file is this directed graph: http://algs4.cs.princeton.edu/42directed/tinyDG.txt

expected: [["1"], ["0", "2", "3", "4", "5"], ["10", "11", "12", "9"], ["6", "8"], ["7"]]
got: [["1"], ["0", "2", "3", "4", "5"], ["10"], ["11", "12", "9"], ["6", "8"], ["7"]]

During this final loop, which tries to make a single component, it starts with "10" (last item on stack), but then the current vertex ("parent") also is "10"! This makes the loop cut off "10" as a separate component. Why is the latest item on the stack the same as the parent node? I'd expect "10" only to show up at the END of the component, after we've collected ["12", "11", "9"...then "10"]. Because "10" shows up first, instead of last, we have this problem. How do I fix it?

  begin
    last_stack_item = stack.pop
    component << last_stack_item.name
  end while last_stack_item != parent # we're back at the root

My Ruby code:

    # Tarjan's algorithm to find all strongly connected components (SCCs)
    def scc_tarjan
      index = 0 # numbers nodes consecutively in the order discovered
      stack, scc, vertices = [], [], []

      # create new Struct, if not already defined
      if Struct::const_defined?("TarjanVertex")
        Struct.const_get("TarjanVertex")
      else
        Struct.new("TarjanVertex", :name, :index, :lowlink)
      end

      adj_lists.each do |v|
        # -1 means vertex is unvisited
        vertex = Struct::TarjanVertex.new(v.name, -1, -1)
        vertices << vertex  # array of all TarjanVertex objects in graph
      end
      vertices.each do |vertex|
        tarjan_dfs(vertex, scc, stack, index, vertices) if vertex.index == -1
      end
      # return nested array of all SCCs in graph
      scc
    end

  def tarjan_dfs(parent, scc, stack, index, vertices)
    # Set depth index for vertex to smallest unused index
    parent.index = index
    # lowlink is roughly the smallest index of any node known to be reachable from the vertex
    parent.lowlink = index
    index += 1
    stack << parent
    # loop through all vertices connected to parent
    adj_vertices(parent.name, adj_lists).each do |adj_vertex|
      # since adj_vertices returns array of strings,
      # must convert to TarjanVertex objects
      child = vertices.select {|v| v.name == adj_vertex}.first

      if child.index == -1  # if child vertex not yet visited
        tarjan_dfs(child, scc, stack, index, vertices) # recurse on child

        # change parent's lowlink to smaller lowlink of parent and child)
        parent.lowlink = [parent.lowlink, child.lowlink].min

      # vertex points to earlier (already visited) one in stack,
      # with lower index. thus it's the current SCC
      elsif stack.include?(child)
        parent.lowlink = [parent.lowlink, child.index].min
      end
    end

    # if a vertex's lowlink = its index here, this # cannot go any lower.
    # vertex MUST be root of the SCC.
    if parent.lowlink == parent.index
      component = []  # a single SCC

      # pop off entire SCC, one vertex at a time
      begin
        last_stack_item = stack.pop
        component << last_stack_item.name
      end while last_stack_item != parent # we're back at the root
      scc << component.sort # done with a single SCC
    end
  end

Solution

  • I solved my own problem! After going through each loop of my code, with pen and paper, I found that it prematurely went to the bottom component loop at vertex 4. parent.lowlink should NOT equal parent.index at that point. I just had to change 1 word to fix my problem!

    I changed "child.index" to "child.lowlink" in the "elsif stack.include?(child)" loop! This correctly dropped 4's lowlink to match that of vertex 6.

    Since then parent.lowlink != parent.index, it does not prematurely start making a new component.

    Interestingly, my solution is different from all the pseudocode and code online I've found on Tarjan's algorithm, which all says "parent.lowlink = [parent.lowlink, child.index].min"

    Instead, I needed "parent.lowlink = [parent.lowlink, child.lowlink].min"