pythonalgorithmgraphgraph-algorithmkosaraju-algorithm

Implementing Kosaraju's Algorithm for SCC's


I am trying to implement Kosaraju's algorithm to find strongly connected components of a directed graph in linear time. The goal is to store and output the list of SCC sizes.

class Graph:
  def __init__(self, edge_list, num_nodes):
    self.graph = edge_list
    self.rev_graph = self.reverse_graph()
    self.num_nodes = num_nodes

    self.traversed_nodes_p1 = []
    self.traversed_nodes_p2 = []

    self.counter = 0;
    self.scc_size_list = []

    self.scc_sl()

  def reverse_graph(self):
    r_graph = [];
    for e in self.graph:
      tail = e[0]; 
      head = e[1];
      r_graph.append([head, tail])
    r_graph = sorted(r_graph, key = lambda x: x[0])
    return r_graph

  def dfs_p1(self, starting_node, g, t_nodes): 
    if (starting_node not in t_nodes):
      t_nodes.insert(0, starting_node)
    for i in range (len(g)): 
      if (g[i][0] == starting_node and g[i][1] not in t_nodes):
        self.dfs_p1(g[i][1], g, t_nodes)
  
  def dfs_loop_p1 (self):
    for i in range (1, self.num_nodes+1):
      self.dfs_p1(i, self.rev_graph, self.traversed_nodes_p1)

  def dfs_p2(self, starting_node, g, t_nodes): 
    if (starting_node not in t_nodes):
      self.counter += 1
      t_nodes.append(starting_node)
    for i in range (len(g)): 
      if (g[i][0] == starting_node and g[i][1] not in t_nodes):
        self.dfs_p2(g[i][1], g, t_nodes)
  
  def dfs_loop_p2 (self):
    for i in self.traversed_nodes_p1:
      self.counter = 0
      self.dfs_p2(i, self.graph, self.traversed_nodes_p2)
      if self.counter > 0:
        self.scc_size_list.append(self.counter)
  
  def scc_sl (self):
    self.dfs_loop_p1()
    self.dfs_loop_p2()

    self.scc_size_list.sort()
    self.scc_size_list.reverse()

This code produces the correct output for smaller graphs (Ex. 9 nodes, 15 edges)

test_edge_list = [[1,2], [1,9], [2,3], [2,8], [3,4], [3,5], [4,5], [6,5], [6,8], [7,6], [7,9], [8,3], [8,7], [9,2], [9,8]]
test = Graph(test_edge_list, 9)
print(test.scc_size_list)

but takes a really long time to work for larger graphs. How can I optimize this?


Solution

  • Great question. Your implementation seems relevantly optimized in terms of speed. There not much you can do to improve your implementation of Kosaraju's algorithm. However, there are other algorithms to find SCCs in faster time. In particular, you should check out:

    These algorithms, like Kosaraju's algorithm, run in O(V+E). However, these other algorithms have smaller constants than Kosaraju's algorithm.