pythonalgorithmkosaraju-sharir

kosaraju finding finishing time using iterative dfs


here is the first part of the code that i have did for Kosaraju's algorithm.

###### reading the data #####
with open('data.txt') as req_file:
        ori_data = []
        for line in req_file:
            line = line.split()
            if line:
                line = [int(i) for i in line]
                ori_data.append(line)

###### forming the Grev ####
revscc_dic = {}
for temp in ori_data:
    if temp[1] not in revscc_dic:
        revscc_dic[temp[1]] = [temp[0]]
    else:
        revscc_dic[temp[1]].append(temp[0])

print revscc_dic        

######## finding the G#####
scc_dic = {}
for temp in ori_data:
    if temp[0] not in scc_dic:
        scc_dic[temp[0]] = [temp[1]]
    else:
        scc_dic[temp[0]].append(temp[1])

print scc_dic        

##### iterative dfs ####
path = []
for i in range(max(max(ori_data)),0,-1):
    start = i
    q=[start]
    while q:
        v=q.pop(0)
        if v not in path:
          path.append(v)
          q=revscc_dic[v]+q
print path  

The code reads the data and forms Grev and G correctly. I have written a code for iterative dfs. How can i include to find the finishing time ?? I understand finding the finishing time using paper and pen but I do not understand the part of finishing time as a code ?? how can I implement it.. Only after this I can proceed my next part of code. Pls help. Thanks in advance.

The data.txt file contains:

1 4
2 8
3 6
4 7
5 2
6 9
7 1
8 5
8 6
9 7
9 3

please save it as data.txt.


Solution

  • With recursive dfs, it is easy to see when a given vertex has "finished" (i.e. when we have visited all of its children in the dfs tree). The finish time can be calculated just after the recursive call has returned.
    However with iterative dfs, this is not so easy. Now that we are iteratively processing the queue using a while loop we have lost some of the nested structure that is associated with function calls. Or more precisely, we don't know when backtracking occurs. Unfortunately, there is no way to know when backtracking occurs without adding some additional information to our stack of vertices.

    The quickest way to add finishing times to your dfs implementation is like so:

    ##### iterative dfs (with finish times) ####
    path = []
    time = 0
    finish_time_dic = {}
    for i in range(max(max(ori_data)),0,-1):
        start = i
        q = [start]
        while q:
            v = q.pop(0)
            if v not in path:
                path.append(v)
                q = [v] + q
                for w in revscc_dic[v]:
                    if w not in path: q = [w] + q
            else:
                if v not in finish_time_dic:
                    finish_time_dic[v] = time
                    time += 1
    print path  
    print finish_time_dic
    

    The trick used here is that when we pop off v from the stack, if it is the first time we have seen it, then we add it back to the stack again. This is done using: q = [v] + q. We must push v onto the stack before we push on its neighbours (we write the code that pushes v before the for loop that pushes v's neighbours) - or else the trick doesn't work. Eventually we will pop v off the stack again. At this point, v has finished! We have seen v before, so, we go into the else case and compute a fresh finish time.

    For the graph provided, finish_time_dic gives the correct finishing times:

    {1: 6, 2: 1, 3: 3, 4: 7, 5: 0, 6: 4, 7: 8, 8: 2, 9: 5}
    

    Note that this dfs algorithm (with the finishing times modification) still has O(V+E) complexity, despite the fact that we are pushing each node of the graph onto the stack twice. However, more elegant solutions exist. I recommend reading Chapter 5 of Python Algorithms: Mastering Basic Algorithms in the Python Language by Magnus Lie Hetland (ISBN: 1430232374, 9781430232377). Question 5-6 and 5-7 (on page 122) describe your problem exactly. The author answers these questions and gives an alternate solution to the problem.

    Questions:

    5-6 In recursive DFS, backtracking occurs when you return from one of the recursive calls. But where has the backtracking gone in the iterative version?

    5-7. Write a nonrecursive version of DFS that can deal determine finish-times.

    Answers:

    5-6 It’s not really represented at all in the iterative version. It just implicitly occurs once you’ve popped off all your “traversal descendants” from the stack.

    5-7 As explained in Exercise 5-6, there is no point in the code where backtracking occurs in the iterative DFS, so we can’t just set the finish time at some specific place (like in the recursive one). Instead, we’d need to add a marker to the stack. For example, instead of adding the neighbors of u to the stack, we could add edges of the form (u, v), and before all of them, we’d push (u, None), indicating the backtracking point for u.