pythonsortinglinked-listbisect

Python, why is bisect.insort much faster than my linkedlist


So I am trying to keep a list in order at all times. So whenever new data comes in, I will insert this to the 'sorted list'.

Question, why is bisect.insort much faster than my linked list implementation. I know bisect search takes O(logn), but due to the insertion to the list, it really takes O(n). Where as the linked list implementation should also be O(n). Inserting new values in the sorted linked list should also be O(n). But why is the time comparison much much slower? Is my linked list implementation not optimized?

Here's my sample code:

import timeit
import bisect
import random


# Testing parameters
NUM_ITERATION_TEST = 10
TOTAL_NUM_DATA = 10000
DATA = [random.randint(0, 1000) for x in range(TOTAL_NUM_DATA)]


class Node():
    def __init__(self, val):
        self.val = val
        self.next = None


class LinkedListIterator():
    def __init__(self, head):
        self.current = head

    def __iter__(self):
        return self

    def __next__(self):
        if not self.current:
            raise StopIteration
        else:
            val = self.current.val
            self.current = self.current.next
            return val


class LinkedList():
    def __init__(self):
        self.head = None

    def __iter__(self):
        return LinkedListIterator(self.head)

    def insert(self, val):
        new_node = Node(val)

        if self.head is None:
            self.head = new_node
            return

        curr = self.head
        if curr.val > val:
            new_node.next = curr
            self.head = new_node
            return

        while curr.next:
            if curr.next.val > val:
                break
            curr = curr.next

        new_node.next = curr.next
        curr.next = new_node


def method1(DATA):
    sorted_list = []
    for num in DATA:
        bisect.insort_right(sorted_list, num)


def method2(DATA):
    sorted_list = LinkedList()
    for num in DATA:
        sorted_list.insert(num)


if __name__ == "__main__":
    # METHOD 1
    print("Method 1 Execution Time:")
    print(timeit.timeit("test_timeit.method1(test_timeit.DATA)",
                        number=NUM_ITERATION_TEST,
                        setup="import test_timeit"))

    # METHOD 2
    print("Method 2 Execution Time:")
    print(timeit.timeit("test_timeit.method2(test_timeit.DATA)",
                        number=NUM_ITERATION_TEST,
                        setup="import test_timeit"))

The execution times are:

Method 1 Execution Time:
0.11593010000000001
Method 2 Execution Time:
33.0651346

I also tried using other implementations like sorted dicts, but nothing really beats the bisect implementation. Is there a more efficient implementation? Basically want an always sorted list of data, where I would constantly add/insert new data to the list..


Solution

  • Your own implementation is executed by the Python interpreter, creating and linking dynamic run-time objects with a lot of superfluous validity checks, one object creation or deletion at a time. The built-in function is optimized in C, already complied. It can allocate memory in larger chunks, manufacture new objects in a single struct mapping, avoid many of the validity checks, ...

    A C-based built-in will (almost) always beat anything you can program in Python.