pythonsortingtime-complexitybubble-sort# How to properly count operations (time complexity) in a bubble sort algorithm in Python

### Time complexity

This is my code:

```
def sort_bubble(list):
comp_counter = 0
swap_counter = 0
n = len(list)
for i in range(n-1):
for j in range(n-i-1):
comp_counter += 1
if list[j] > list[j+1]:
swap_counter += 1
list[j], list[j+1] = list[j+1], list[j]
return comp_counter, swap_counter
```

It needs to bubble sort a list, and then return the proper amount of operations (time complexity).

I ran this code in two cases (each list had 20000 intergers):

a) Sorting an already sorted list (best case scenario): **Number of comparsions = 49995000 = n(n-1)/2**; **Number of swaps = 0**. Seems to me like something is wrong here, because number of comparsions in a best case scenario should be **O(n)** and the number of swaps **O(1)** (for a bubble sort algorithm).

b) Sorting a "backwards" sorted list (worst case scenario) : **Number of comparsions = 49995000 = n(n-1)/2**; **Number of swaps = 49993762**. In a worst case scenario this algorithm should have **O(n^2)** swaps and **O(n^2)** comparsions. Meanwhile the number of swaps and comparsions is something closer to **O(n^2)/2**

It is obvious to me that either my code, or my understanding of time complexity in algorithms is completely scuffed. Probably both. What am I doing wrong?

Solution

return the proper amount of operations (time complexity).

The number of operations you get from running a program are not equivalent to its time complexity. Time complexity is something that you derive from an *analysis* that proves that this number of operations is bounded by a function of 𝑛 for all great enough values of 𝑛.

best case scenario ... worst case scenario: Number of comparsions = 49995000 = n(n-1)/2

The comparison counter is increased in a way that does not depend on the *contents* of the input list. Its final value *only* depends on 𝑛. And so there is no best or worst case scenario for it. It will always end up as 𝑛(𝑛−1)/2.

If you expected to see a difference, then this means you were supposed to implement a version of bubble sort that detects that the input list is sorted and can exit earlier. See also Wikipedia on bubble sort, and how in one version it uses a `swapped`

boolean variable, and in another `n`

and `newn`

. That latter version identifies in a given iteration of the outer loop where the last swap occurs, and will conclude that all values from that index onwards are in their final, sorted position and do not need to be compared again.

Unrelated, but don't call your list `list`

, as that is the name of a native Python function.

Here is how your code could be adapted:

```
def sort_bubble(lst):
comp_counter = 0
swap_counter = 0
n = len(lst)
sorted_from = n
while sorted_from > 0:
last_swap_index = 0
for j in range(sorted_from - 1):
comp_counter += 1
if lst[j] > lst[j+1]:
swap_counter += 1
lst[j], lst[j+1] = lst[j+1], lst[j]
last_swap_index = j + 1
sorted_from = last_swap_index
return comp_counter, swap_counter
```

For the best case scenario you'll now get a number of comparisons that is the minimum possible.

Time complexity is something you get from an analysis. Here we can prove that the number of comparisons is θ(𝑛) in the best case and θ(𝑛²) in the worst case. The number of swaps has a best case of θ(1) and a worst case of θ(𝑛²).

Meanwhile the number of swaps and comparisons is something closer to O(n^2)/2

This confirms the misunderstanding of what time complexity is. O(𝑛²)/2 -- or better put, O(𝑛²/2) -- is the same as O(𝑛²). You can find more information about this at Wikipedia on Big O notation.

- Activating Anaconda Environment in VsCode
- How to implement the Softmax function in Python?
- Dataframe - Select the minimum set of rows to cover all possible values of each columns
- Non-equi join in polars
- Tkinter and matplotlib - NavigationToolbar2Tk - cursor position precision
- Media Image is not loaded in the output of Django Project
- selenium upload part not working on windows. Path related?
- How to encode UTF8 filename for HTTP headers? (Python, Django)
- Split columns in Csv file
- In the Django admin site, how do I change the display format of time fields?
- Condition in pytest fixture depending on some test-suite results
- How to make a class JSON serializable
- Apache Airflow unit and integration test
- TypeError: 'MultiPolygon' object is not iterable with Plotly Choropleth
- How do I get the current time in milliseconds in Python?
- Python: Decoding base64 encoded strings
- How to fix the error when try to plot data with pandas?
- Python Image - Finding largest branch from image skeleton
- Enumerate is Returning A None Type
- Python finite difference functions?
- How to extract hex color codes from a colormap
- Regex: Get "London, UK" from "Evolution Recruitment (Agency) (London, UK)"
- Visual Studio Code - How to add multiple paths to python path?
- What are the Spark transformations that causes a Shuffle?
- How to find the second lowest lists into a nested list by their second value?
- python-docx: Parse a table to Panda Dataframe
- Deposit and Withdraw on Account
- Why does this code snippet give the error list object is not callable?
- How to find out if argparse argument has been actually specified on command line?
- Print current call stack from a method in code