So I am trying to keep a list in order at all times. So whenever new data comes in, I will insert this to the 'sorted list'.
Question, why is bisect.insort much faster than my linked list implementation. I know bisect search takes O(logn), but due to the insertion to the list, it really takes O(n). Where as the linked list implementation should also be O(n). Inserting new values in the sorted linked list should also be O(n). But why is the time comparison much much slower? Is my linked list implementation not optimized?
Here's my sample code:
import timeit
import bisect
import random
# Testing parameters
NUM_ITERATION_TEST = 10
TOTAL_NUM_DATA = 10000
DATA = [random.randint(0, 1000) for x in range(TOTAL_NUM_DATA)]
class Node():
def __init__(self, val):
self.val = val
self.next = None
class LinkedListIterator():
def __init__(self, head):
self.current = head
def __iter__(self):
return self
def __next__(self):
if not self.current:
raise StopIteration
else:
val = self.current.val
self.current = self.current.next
return val
class LinkedList():
def __init__(self):
self.head = None
def __iter__(self):
return LinkedListIterator(self.head)
def insert(self, val):
new_node = Node(val)
if self.head is None:
self.head = new_node
return
curr = self.head
if curr.val > val:
new_node.next = curr
self.head = new_node
return
while curr.next:
if curr.next.val > val:
break
curr = curr.next
new_node.next = curr.next
curr.next = new_node
def method1(DATA):
sorted_list = []
for num in DATA:
bisect.insort_right(sorted_list, num)
def method2(DATA):
sorted_list = LinkedList()
for num in DATA:
sorted_list.insert(num)
if __name__ == "__main__":
# METHOD 1
print("Method 1 Execution Time:")
print(timeit.timeit("test_timeit.method1(test_timeit.DATA)",
number=NUM_ITERATION_TEST,
setup="import test_timeit"))
# METHOD 2
print("Method 2 Execution Time:")
print(timeit.timeit("test_timeit.method2(test_timeit.DATA)",
number=NUM_ITERATION_TEST,
setup="import test_timeit"))
The execution times are:
Method 1 Execution Time:
0.11593010000000001
Method 2 Execution Time:
33.0651346
I also tried using other implementations like sorted dicts, but nothing really beats the bisect implementation. Is there a more efficient implementation? Basically want an always sorted list of data, where I would constantly add/insert new data to the list..
Your own implementation is executed by the Python interpreter, creating and linking dynamic run-time objects with a lot of superfluous validity checks, one object creation or deletion at a time.
The built-in function is optimized in C, already complied. It can allocate memory in larger chunks, manufacture new objects in a single struct
mapping, avoid many of the validity checks, ...
A C-based built-in will (almost) always beat anything you can program in Python.