python algorithm search graph-theory depth-first-search

When to mark visited in iterative DFS? at push or at pop

I'm working on implementing the iterative (non-recursive) version of dfs. I'm confused when to mark a node as visited: at push or at pop.

In bfs, we mark visited when pushing it to queue, so it seems naturally that we should also do the same in dfs. In my understanding, marking at push (earlier stage than pop) also seem to reduce duplicated finding because the same point would have the same result and thus improve speed.

However, I've learnt that a more common and preferred way is to mark at pop. I couldn't think of any advantages of this way. They should both be, however, able to find a solution and only differ in paths.

So are both ways accepted as correct dfs and why marking at pop is preferred?

sample code for mark at pop:

def dfs()
    visited = set()
    stack.push(start)
    while not stack.isEmpty():
        now = stack.pop()
    
        visited.add(now)  # Mark visited at pop
    
        if isGoal(now):
            return "Found"
        for nbr in find_neighbours(now):
            if not nbr in visited:
                stack.push(nbr)
    return "NotFound"

sample code for mark at push:

def dfs()
    visited = set(start)
    stack.push(start)
    while not stack.isEmpty():
        now = stack.pop()
        if isGoal(now):
            return "Found"
        for nbr in find_neighbours(now):
            if not nbr in visited:
                stack.push(nbr)
    
                visited.add(nbr)  # Mark visited at push
    
    return "NotFound"

Edit:

Marking at pop can be indeed the preferred way for dfs if we choose to "check visited" at pop as well. Instead of checking neighbours by if not nbr in visited:, we can check the node to be expanded instead by if not now in visited:. Compared with my original code of checking visited at push, checking visited at pop seems safer and correct because we just don't expand it but before that we still add it to stack. The reason why we mark visited at pop is as @MattTimmermans said.

Solution

There's a trade-off involved here.

As @trincot said in his answer, marking nodes when you push them onto the stack can be more efficient, since it ensures that each node will only be pushed, and therefore processed, once.

However, if you do that, then your algorithm is no longer DFS. Consider the graph:

     A
   / | \
  B  D  \
 /      /
C -----/

When you process A, you will push B, D, C. Then you will process B. C comes next, but you can't push it onto the stack if you already marked it when you processed A.

To make DFS work in an implementation like this where you push all the children of a node at once, you need to do the marking AND checking when you pop nodes off the stack. Then your algorithm will process all nodes in preorder, that is DFS where a node is processed before its children.

The difference is really that the nodes on the stack are not nodes. They are edges. Each edge is pushed onto the stack exactly once, when its first parent edge is processed. The parent, however, just doesn't have to be remembered, so you only use the stack to remember the its child node.

The most useful way to think about it is that the representation of a node on the stack needs to capture the remaining work that's left to do after the node is processed. Then in each iteration, you:

Look at the work on the stack
Remove the next job, updating or removing the "remaining work" record
Do the job you removed. If it's a "process a node" job, then the remaining work is "process the children of this node".
Go back to 1

In your current implementation style, the "remaining work" for a node on the stack is just a stack entry for each of its children that you have yet to process.

If you want a more efficient implementation where you only push each node once, then the thing you push to capture "remaining work" needs to be more complicated, like an iterator for the children.