graphcomplexity-theorydepth-first-searchcycle-detection

Time complexity for detecting a cycle in a graph


I am trying to understand the time complexity of some efficient methods of detecting cycles in a graph.

Two approaches for doing this are explained here. I would assume that time complexity is provided in terms of worst-case.
The first is union-find, which is said to have a time complexity of O(Vlog E).
The second uses a DFS based approach and is said to have a time complexity of O(V+E). If I am correct this is a more efficient complexity asymptotically than O(Vlog E). It is also convenient that the DFS-based approach can be used for directed and undirected graphs.

My issue is that I fail to see how the second approach can be considered to run in O(V+E) time because DFS runs in O(V+E) time and the algorithm checks the nodes adjacent to any discovered nodes for the starting node. Surely this would mean that the algorithm runs in O(V2) time because up to V-1 adjacent nodes might have to be traversed over for each discovered node? It is obviously impossible for more than one node to require the traversal of n-1 adjacent nodes but from my understanding this would still be the upper bound of the runtime.

Hopefully someone understands why I think this and can help me to understand why the complexity is O(V+E).


Solution

  • The algorithm, based on DFS, typically maintains a "visited" boolean variable for each vertex, which contains one bit of information - this vertex was already visited or not. So, none of vertices can be visited more than once.

    If the graph is connected, then starting the DFS from any vertex will give you an answer right away. If the graph is a tree, then all the vertices will be visited in a single call to the DFS. If the graph is not a tree, then a single call to the DFS will find a cycle - and in this case not all the vertices might be visited. In both cases the subgraph, induced by all the already visited vertices, will be a tree at each step of the DFS lookup - so the total number of traversed edges will be O(V). Because of that we can reduce the time complexity estimate O(V+E) of the cycle detection algorithm to O(V).

    Starting the DFS from all vertices of the graph is necessary in the case when the graph consists of a number of connected components - the "visited" boolean variable guarantees that the DFS won't traverse the same component again and again.