c++parallel-processingc++17stdaccumulate

Why is there no parallel `std::accumulate` in the C++ standard?


I think it is confusing that there is no parallel version of std::accumulate in the C++ standard. It appears to me that it would be trivial to implement it in parallel, for instance, based on OpenMP or SIMD instructions. Does anybody have a good explanation for why the standard committee chose to introduce a parallel version of std::reduce but not std::accumulate? Is it because of the different iterator types?


Solution

  • They introduced reduce because accumulate cannot be parallelized as it was written.

    The C++ standard defines the expected behavior of the various functions. accumulate is defined as follows:

    Computes its result by initializing the accumulator acc with the initial value init and then modifies it with acc = std::move(acc) + *i or acc = binary_op(std::move(acc), *i) for every iterator i in the range [first, last) in order.

    Emphasis added. Those two words make the algorithm unparallelizable. And because those words are part of the standard, users of accumulate are allowed to rely on the "in order" guarantee. They can make operator+ or binary_op non-associative operations, ones that rely on the ordering of the operation. So the committee couldn't just take those words away without potentially breaking a bunch of code.

    And they didn't want to have a parallel version of accumulate that had fundamentally different behavior from the non-parallel version. Therefore, reduce was introduced without the "in order" guarantee (and with other language that helped make it parallelizable).