pythonalgorithmtime-complexitybreadth-first-searchspace-complexity# Two implementation methods of BFS for finding the shortest path, which one is the obvious winner?

### Remark on your code

### Comparison

There are two ways of implementing BFS to find the shortest path between two nodes. The first is by using a list of lists to represent the queue of paths. Another is to maintain a mapping from each node to its parent, and when inspecting the adjacent node, record its parent, finally do backtrace according to the parent mapping to find the path. (See this post for more details. https://stackoverflow.com/a/8922151/13886907. Thanks for qiao's answers and codes to that question!)

Copied them here: First way:

```
def bfs(graph, start, end):
# maintain a queue of paths
queue = []
# push the first path into the queue
queue.append([start])
while queue:
# get the first path from the queue
path = queue.pop(0)
# get the last node from the path
node = path[-1]
# path found
if node == end:
return path
# enumerate all adjacent nodes, construct a
# new path and push it into the queue
for adjacent in graph.get(node, []):
new_path = list(path)
new_path.append(adjacent)
queue.append(new_path)
print(bfs(graph, 'A', 'F'))
```

Second way:

```
def backtrace(parent, start, end):
path = [end]
while path[-1] != start:
path.append(parent[path[-1]])
path.reverse()
return path
def bfs(graph, start, end):
parent = {}
queue = []
queue.append(start)
while queue:
node = queue.pop(0)
if node == end:
return backtrace(parent, start, end)
for adjacent in graph.get(node, []):
if node not in queue :
parent[adjacent] = node # <<<<< record its parent
queue.append(adjacent)
print(bfs(graph, 'A', 'F'))
```

and the graph (directed graph)

```
graph = {'A': ['C', 'D', 'B'],
'B': ['C', 'E'],
'C': ['E'],
'D': ['F'],
'E': ['F']}
```

We can see that the second way can save the memory cost since the queue does not need to store the paths, and the space complexity for both the queue and parent map is O(V), where V is the number of vertices. And also, the final backtrace process costs at most extra O(V) time.

So, does the second way is superior in all aspects to the first way in finding the shortest or all paths between two nodes in a directed graph? Can we think of the second as an optimization of the basic version of BFS (first way)?

Solution

The second version is superior. This is because the memory allocation also costs time. Namely this line:

```
new_path = list(path)
```

...has a time complexity of O(k) in terms of the length of `path`

. Even in the *best* case, where the graph is actually just a single path from source to target node, the first code will spend O(1) + O(2) + O(3) + ... + O(n) on the executions of this `list(path)`

call, which is O(n²). The second version will be O(n) in this "happy path" case. Things only get worse when the branching factor in the graph becomes greater.

Both code snippets have issues:

The first version does not have protection against running in loops. You should add a visited-marker so that the same node is not visited twice

The second version

*seems*to have such protection, but it is not good enough. It checks whether the next node is already in the queue. But if even it is not, it could have previously been, and also in that case, it should not be revisited. We can use`parent`

to know whether the node was already visited.

So here are the corrected snippets:

```
def bfs_1(graph, start, end):
queue = []
visited = set()
queue.append([start])
visited.add(start)
while queue:
path = queue.pop(0)
node = path[-1]
if node == end:
return path
for adjacent in graph.get(node, []):
if adjacent not in visited:
visited.add(adjacent)
new_path = list(path)
new_path.append(adjacent)
queue.append(new_path)
```

And

```
def backtrace(parent, start, end):
path = [end]
while path[-1] != start:
path.append(parent[path[-1]])
path.reverse()
return path
def bfs_2(graph, start, end):
parent = {}
queue = []
queue.append(start)
parent[start] = None
while queue:
node = queue.pop(0)
if node == end:
return backtrace(parent, start, end)
for adjacent in graph.get(node, []):
if adjacent not in parent:
parent[adjacent] = node # <<<<< record its parent
queue.append(adjacent)
```

I used the following testing code to test the above algorithms:

```
import random
from timeit import timeit
def create_graph(size):
graph = {}
nodes = list(range(size))
for i in nodes:
graph[i] = set(random.choices(nodes, k=3))
if i in graph[i]:
graph[i].remove(i)
graph[i] = list(graph[i])
return graph
graph = create_graph(40000)
print("version 1")
print(bfs_1(graph, 1, 2))
print("time used", timeit(lambda: bfs_1(graph, 1, 2), number=10))
print()
print("version 2")
print(bfs_2(graph, 1, 2))
print("time used", timeit(lambda: bfs_2(graph, 1, 2), number=10))
```

The generated graph has 100 000 nodes, with a branching factor of approximately 2. The edges are random. Most of the time the second algorithm is faster than the first. This difference becomes more explicit when the solution path is longer.

- AttributeError: install_layout when attempting to install a package in a virtual environment
- Python list comprehension - want to avoid repeated evaluation
- Hash algorithm for dynamic growing/streaming data?
- matplotlib - making labels for violin plots
- Python How to I check if last element has been reached in iterator tool chain?
- Polars and the Lazy API: How to drop columns that contain only null values?
- Why are my Mean, Var, and Std outputs from NumPy different from what the online grader expects?
- Correlation dataframe convertion from results from pl.corr
- Polars DataFrame transformation
- Discord rate limiting while only sending 1 request per minute
- Check if column contains (/,-,_, *or~) and split in another column - Pandas
- How to draw a rectangle at (x,y) in a PyQt GraphicsView?
- how to calculate correlation between ten columns with polars
- How to set class attribute with await in __init__
- Detect hindi encoding, response received from Facebook API in Python
- Is it possible to write a horizontal if statement with a multi-line body?
- Max length of items in list
- Cannot subclass multiprocessing Queue in Python 3.5
- How can I get notified of updates to Python packages in a unified way?
- Using python AST to traverse code and extract return statements
- merge groups of columns in a polars dataframe to single columns
- Group Pandas DataFrame by Continuous Date Ranges
- Flask login @login_required not working
- Odoo: one2many and many2one? KeyError:'___'
- merge some columns in a Polars dataframe and duplicate the others
- Python: Create table from string mixed with separators using FOR loops
- How do I type hint a method with the type of the enclosing class?
- How can I verify an emails DKIM signature in Python?
- Writing a class that accepts a callback in Python?
- Python Paramiko channel.exec_command not returning output intermittently