linuxalgorithmsortingg++stdlist

What makes the gcc std::list sort implementation so fast?


I have a linked list implementation and I'm experimenting with both MergeSort and QuickSort algorithms.

What I don't understand is why the sort operation in std::list is so fast. Looking at the std::list under Linux and it appears to be linked list as well, not an array based list.

The merge sort I tried almost identical to Dave Gamble's version here: Merge Sort a Linked List

Also, I thought I'd try a simple quicksort based on this code: http://www.flipcode.com/archives/Quick_Sort_On_Linked_List.shtml

Surprisingly, to sort 10 million random numbers using std::list and sort was around 10 times quicker than either of those others.

And for those that are asking, yes I need to use my own list class for this project.


Solution

  • I've been taking a look at the interesting libstdc++ (GCC) implementation for list::sort (source code) and it doesn't seem to implement a traditional merge sort algorithm (at least not one I've ever seen before).

    Basically what it does is:

    1. Creates a series of buckets (64 total).
    2. Removes the first element of the list to sort and merges it with the first (i=0th) bucket.
    3. If, before the merge, the ith bucket is not empty, merge the ith bucket with the i+1th bucket.
    4. Repeat step 3 until we merge with an empty bucket.
    5. Repeat step 2 and 3 until the list to sort is empty.
    6. Merge all the remaining non-empty buckets together starting from smallest to largest.

    Small note: merging a bucket X with a bucket Y will remove all the elements from bucket X and add them to bucket Y while keeping everything sorted. Also note that the number of elements within a bucket is either 0 or 2^i.

    Now why is this faster then a traditionnal merge sort? Well I can't say for sure but here are a few things that comes to mind:

    I'm pretty sure the folks who implemented this algorithm tested it thoroughly so if you want a definitive answer you'll probably have to ask on the GCC mailing list.