c++linked-listradix-sort

Make RadixSort with linked lists on C++


I'm learning C++ and learning linked lists. I'm currently trying to make a radix sort for this type of lists, but my approach is not working so I was wondering if anyone could advice me on how to do it. Here's my code:

void simpleList::radixSort() {
    for (int i = 0; i < numberOfDigits(); i++){
        Node * tmp = firstNode;
        for (int j = 0; j < counter(); j++){
            int pow = 10;
            for (int k = 0; k < 10; k++){
                if (tmp -> data % pow == k){
                    insertFirst(tmp->data);
                    tmp = tmp -> next;
                }
                pow= pow * 10;
            }
        }
    }

}

The function numberOfDigits() gets you the amount of digits on the max number of the list, and counter() the amount of elements in the list. Also insertFirst() puts the number at the beggining of the list.


Solution

  • A few notes to help you on your way:

    Radix Sort

    A radix sort is any kind of (stable n-ary bucket sort) repeated until you run out of digits in your sort keys. Make your life simple: use binary as your radix.

    Keys

    You are using int values in your list, but that need not be the case. In general, you need a value→key function that returns an integer key for each element of your list, where “equivalent” elements return the same key. For a list of integers the value→key function is the identity function, so easy enough.
    For convenience, I will use lambdas for the value→key functions below.

    Reducing Passes

    You can reduce the number of times you bucket sort to only those bits that are not the same for all keys. Before the first pass through your list, you do not know anything about the keys, so we can gather information at the same time as the first bucket sort:

    key_type key_mask       =  0; // bits to sort
    key_type common_mask    = ~0; // (bits set in all keys)
    auto value_to_bucket_fn = [&]( const T & element )
    {
      // here we gather information about the keys
      key_type key = user_supplied_value_to_key_fn( element );
      key_mask    |= key;
      common_mask &= key;
      // here we return our first bucket key == bit 0
      return key & 1;
    };
    
    binary_bucket_sort( value_to_bucket_fn );
    

    Once we have made that first pass, we can get a bitmask indicating which bits need sorting in our keys:

    key_mask ^= common_mask;  // key_mask ← set bits == bits to sort
    

    All remaining passes can now be managed with a simple loop, ending when we run out of bits needing sorting:

    for (int shift = 1;  key_mask >>= 1;  shift++)
      if (key_mask & 1)
        binary_bucket_sort( [&]( const T & element )
        {
          return (user_supplied_value_to_key_fn( element ) >> shift) & 1;
        } );
    

    Bucket Sort

    A linked list is perfect for bucket sorts of large, unwieldly objects. Remember: the bucket sort must be stable, meaning it must not mix up the order of “equivalent” elements. This is imperative for a radix sort to work properly.

    For a binary bucket sort over a linked list, life is pretty simple — only two buckets are necessary, and you only need keep track of the last node in each bucket to append.

    If you are using a doubly-linked list that bookkeeping is done for you. If you are using a singly-linked list you will have to do it manually.

    Node   heads[2] = { Node{},    Node{}    };
    Node * tails[2] = { &heads[0], &heads[1] };
    
    while (head)
    {
      int bucket = value_to_bucket_fn( head->data );
      tails[bucket]->next = head;  // append current head to correct bucket
      tails[bucket]       = head;  // bucket’s new tail is the current head
      head = head->next;  // pop current head; get new current head
    }
    

    Hopefully you can see how easy this would be to expand to any number of buckets. Still, we will stick with two.

    Once you have split all the nodes into the two buckets, just join them back together into your new complete list. Don’t forget to clean up the tail’s next pointer!

    head = heads[0]->next;
    tails[0]->next = heads[1]->next;
    tails[1]->next = nullptr;
    

    Be sure to check out trincot’s answer to see how he defined his singly-linked list with a lastNode pointer and useful functions to make all this splitting into buckets (“partitions”) and joining the final list into invocations of very inexpensive member functions.

    Generics

    This answer spends some time going on about keys and non-integer values. I have defined my list type’s nodes as:

    struct node_type
    {
      T           data;
      node_type * next;
    };
    

    And I have defined the sorting functions as:

    template <typename ValueToBucket>
    void binary_bucket_sort( ValueToBucket value_to_bucket );
      
    template <typename ValueToKey>
    void radix_sort( ValueToKey value_to_key );
    

    Then, when I sorted my test lists, I used variations of:

    list <int> xs;
    ...
    xs.radix_sort( []( int x ) { return x; } );
    

    You can do things like observe the stability in the sort by playing with the value→key function (lambda). For example, I could define a list of integers where the one’s digit didn’t matter:

    xs.radix_sort( []( int x ) { return x / 10; } );
    

    Or a list of floats where the fractional part only mattered to two decimal places:

    xs.radix_sort( []( double x ) { return static_cast <long long> ( x * 100 ); } );
    

    I could have a list of Student where I am only interested in sorting by the student’s ID:

    xs.radix_sort( []( const Student & student ) { return student.ID; } );
    

    As long as the value→key function returns a sufficiently unique integer value we can use radix sort.