c++ multithreading shared-data std-future

Multithreading with std::future in C++: Accessing shared data


I am currently developing a multi-threaded application in C++ where different threads are expected to process data from a shared data structure. I'm aware that the standard library provides std::future and std::async to easily handle asynchronous operations, and I'm trying to use these in my application.

Here's a simplified sketch of my code:

#include <vector>
#include <future>

std::vector<int> shared_data;

// Some function to be executed asynchronously
void process_data(size_t start, size_t end) {
    for (size_t i = start; i < end; ++i) {
        // Do something with shared_data[i]
    }
}

int main() {
    std::future<void> fut1 = std::async(std::launch::async, process_data, 0, 10);
    std::future<void> fut2 = std::async(std::launch::async, process_data, 10, 20);

    // Other operations...

    return 0;
}

I have the following questions regarding this code:

  • Since shared_data is being accessed by multiple threads, do I need to protect it with a std::mutex or another synchronization primitive?
  • Is there a way to pass std::future objects to other functions or store them in a data structure, and what are the potential implications of doing so?
  • How can I handle exceptions thrown by the process_data function and propagated through the std::future objects?

Any guidance or best practices related to the usage of std::future in multithreaded scenarios would be greatly appreciated.

In order to make the shared data access thread-safe, I attempted to introduce an std::mutex and lock it using std::lock_guard in the process_data function like so:

std::mutex mtx;  // requires <mutex>

void process_data(size_t start, size_t end) {
    // The lock is held for the entire loop, so the two async tasks
    // end up processing their ranges one after the other.
    std::lock_guard<std::mutex> lock(mtx);
    for (size_t i = start; i < end; ++i) {
        // Do something with shared_data[i]
    }
}

I also attempted to store std::future objects in a std::vector for later use, and tried to handle exceptions using a try/catch block around the get() function of std::future.
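
Roughly, that attempt looked like the following; the ranges, and the exception I throw from process_data here, are just placeholders to exercise the error path:

#include <vector>
#include <future>
#include <iostream>
#include <stdexcept>

std::vector<int> shared_data(20, 0);

void process_data(size_t start, size_t end) {
    for (size_t i = start; i < end; ++i) {
        // Do something with shared_data[i]
        if (shared_data[i] < 0)
            throw std::runtime_error("unexpected negative value");
    }
}

int main() {
    std::vector<std::future<void>> futures;
    futures.push_back(std::async(std::launch::async, process_data, 0, 10));
    futures.push_back(std::async(std::launch::async, process_data, 10, 20));

    for (auto& fut : futures) {
        try {
            fut.get();  // blocks until the task finishes, then rethrows anything it threw
        } catch (const std::exception& e) {
            std::cerr << "task failed: " << e.what() << '\n';
        }
    }
    return 0;
}

As I understand it, get() can only be called once on each std::future (it becomes invalid afterwards), so I keep the futures around only until I have collected their results.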

I was expecting that locking the std::mutex would ensure that only one thread can access the shared data at a time, preventing race conditions. I also expected that I would be able to easily store the std::future objects in a vector and handle exceptions from the asynchronous tasks.

However, I'm unsure if these methods are the most efficient or even correct, given the lack of detailed examples or guidelines on these topics in the documentation and tutorials I've found. I'm particularly interested in understanding the correct way to use std::future and std::async in more complex scenarios, and how to handle exceptions properly in this context.


Solution

  • If the data is read-only (and it's not too much), just copy it. Otherwise, put your data behind a std::shared_ptr and, using a lambda expression, capture the shared_ptr by value (not by reference!). This extends the lifetime of the data to the lifetime of the thread that uses it longest. So something like this:

    std::shared_ptr<SharedData> data;
    auto future = std::async(std::launch::async, [data] { process_data(data); });

    If the data is read/write, then add a mutex or some other synchronization mechanism to your data class and use getters/setters that take the lock to read and update the values in the data, as sketched below.
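
A minimal sketch of how those two points can fit together; the SharedData interface and the process_data signature here are illustrative, not a fixed API:

#include <memory>
#include <mutex>
#include <vector>
#include <future>

// Read/write data wrapped with its own mutex; all access goes
// through getters/setters that take the lock.
class SharedData {
public:
    explicit SharedData(size_t n) : values_(n, 0) {}

    int get(size_t i) const {
        std::lock_guard<std::mutex> lock(mtx_);
        return values_[i];
    }

    void set(size_t i, int v) {
        std::lock_guard<std::mutex> lock(mtx_);
        values_[i] = v;
    }

private:
    mutable std::mutex mtx_;
    std::vector<int> values_;
};

void process_data(std::shared_ptr<SharedData> data, size_t start, size_t end) {
    for (size_t i = start; i < end; ++i) {
        data->set(i, data->get(i) + 1);  // each access is individually locked
    }
}

int main() {
    auto data = std::make_shared<SharedData>(20);

    // Capture the shared_ptr by value: each task keeps the data alive
    // for as long as it runs.
    auto fut1 = std::async(std::launch::async, [data] { process_data(data, 0, 10); });
    auto fut2 = std::async(std::launch::async, [data] { process_data(data, 10, 20); });

    fut1.get();
    fut2.get();
    return 0;
}

Per-element locking like this is simple but can be slow under contention; a coarser alternative is a member function that takes the lock once and processes a whole range.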