c++multithreadingsorting

How to use every thread once in a recursive sorting algorithm?


In the following implementation of the quicksort algorithm, I'm trying to make use of every thread available in the system:

#include <iostream>
#include <vector>
#include <thread>

void quicksort(std::vector<double> & v, int const begin, int const end){

    /* Algorithm omitted for brevity's sake here  */

    
     // The following recursive function calls are the intended target of multithreading

    quicksort(v, begin, i-1);
    quicksort(v,i+1,end);
    
    return;
};


int main() {
    std::vector<double> vect1; /* Initialization to 30 million random elements omitted */

    const unsigned int n_threads = std::thread::hardware_concurrency();
    std::vector<std::thread> threads(n_threads);
    unsigned int count = 0;

    quicksort(vect1,0,vect1.size()-1);

    return 0;
}

I am stuck at the code above. My question is how to execute every thread at least once during the recursive calls?

I am aware that if the function to multithread wasn't recursive, you do this:

// ...

void task() {
    // some task here
}

int main() {
    const unsigned int n_threads = std::thread::hardware_concurrency();
    std::vector<std::thread> threads(n_threads);

    // Execute the same function in every thread
    for (int i = 0; i < n_threads; ++i) {
        threads[i] = std::thread(task);
    }

    for (int i=0; i<n_threads; i++){
        threads[i].join();
    } // Let every thread finish

    return 0;
}


However, the recursive nature of the algorithm complicates this. Maybe some kind of data structure outside the first call of quicksort(...) is needed to keep track of the threads that are already-exectuted, and the yet-to-be-executed ones?


Solution

  • I have been able to reach a solution through the utilization of OpenMP (thanks to @JérômeRichard for suggesting this tool to me).

    #include <iostream>
    #include <iomanip>
    #include <ios>
    #include <vector>
    #include <random>
    #include <omp.h>
    
    void quicksort(std::vector<double> & v, int const begin, int const end){
        
        /*
        Quicksort algorithm omitted for brevity.
        (As with every implementation, element swaps
        are made, and the necessary index i is
        created and initialized correctly)
        */
    
            #pragma omp task shared(v,begin,i)
            {
        quicksort(v, begin, i-1);
        }
            
            #pragma omp task shared(v,i,end)
            {
        quicksort(v,i+1,end);
        }
          
        return;
    };
    
    
    int main() {
        std::vector<double> vect1; // Initialization of vect1 to several million random elements omitted
    
        #pragma omp parallel
        {
          #pragma omp single
          { quicksort(vect1,0,vect1.size()-1); } 
        }
    
        // Omitted here: a for-loop to print some of the elements of vect1 to check whether the sort was successful
    
        return 0;
    }
    

    You will notice that the following expressions from OpenMP are used:

    #pragma omp parallel
    #pragma omp single
    #pragma omp task shared(v,begin,i)
    #pragma omp task shared(v,i,end)
    

    And the header <omp.h> is included.

    This is compiled with g++, and it is necessary to add the flag -fopenmp to allow for the use of OpenMP tools in the code:

    g++ -W -Wall -pedantic -fopenmp -o ProgramName -p FileName.cpp
    

    Note that OpenMP also contains another construct called Sections (e.g., #pragma omp sections), but my understanding is that Tasks (e.g., #pragma omp task) supersede Sections.