OpenMP 4.5 supports reduction over raw arrays, reduction(+:array[:]), analogous to the earlier scalar reduction, reduction(+:scalar).
There is plenty of information on how a scalar to be reduced can be initialized privately for each thread (things like omp_priv and omp_orig), but I cannot find much on the initialization of private arrays.
I wonder whether the private array of each thread is initialized as
(a) a copy of the original array defined above the parallel block or
(b) an all-zero array?
It depends on the reduction type.
For addition, the initial array is all-zero.
For multiplication, the initial array is all-one.
The initial values probably follow Table 2.11 of the OpenMP specification, just as for scalar reductions.
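A practical consequence of the zero-initialized private copies is that the original contents of the array are not lost: they are combined with the per-thread partial sums when the loop ends. Here is a minimal sketch of that behaviour. The function name add_counts and the bin layout are made up for illustration, and the samples are assumed to be non-negative.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: adds one count per sample to bins[0..nbins).
// Each thread's private copy of bins starts at zero (the identity for +);
// the partial sums are added to the original contents of bins at the end.
// Assumes every sample is non-negative.
void add_counts(int* bins, int nbins, const std::vector<int>& samples) {
    assert(bins != nullptr && nbins > 0);
    const int n = static_cast<int>(samples.size());
    #pragma omp parallel for reduction(+:bins[:nbins])
    for (int i = 0; i < n; ++i) {
        bins[samples[i] % nbins] += 1;
    }
}
```

Calling add_counts on bins holding 10 20 30 with the samples 0 1 1 2 2 2 should leave 11 22 33 in bins, whichever thread handled which sample.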
If this is not what you want, you can also declare your own reduction with declare reduction and the identifiers omp_out, omp_in, omp_priv and omp_orig. Initializer expressions such as omp_priv = omp_orig seem to be applied to each element of the array as an individual scalar.
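As a sketch of the case where each private element should start as a copy of the original, the following declares an element-wise maximum. The reduction name ElemMax and the function column_max are hypothetical, and every row is assumed to have at least ncols entries. It relies on the per-element behaviour observed above rather than on anything I can point to in the specification.

```cpp
#include <cassert>
#include <vector>

// Hypothetical reduction ElemMax: the combiner keeps the larger value, and
// each private element starts as a copy of the original (omp_priv = omp_orig).
#pragma omp declare reduction(ElemMax : int : omp_out = (omp_in > omp_out ? omp_in : omp_out)) initializer(omp_priv = omp_orig)

// Raises best[j] to the largest value found in column j of rows.
// Assumes every row holds at least ncols entries.
void column_max(int* best, int ncols, const std::vector<std::vector<int>>& rows) {
    assert(best != nullptr && ncols > 0);
    const int nrows = static_cast<int>(rows.size());
    #pragma omp parallel for reduction(ElemMax : best[:ncols])
    for (int i = 0; i < nrows; ++i) {
        for (int j = 0; j < ncols; ++j) {
            if (rows[i][j] > best[j]) best[j] = rows[i][j];
        }
    }
}
```

Because taking a maximum is idempotent, starting every private copy from the original values does not count them twice, which would happen with a sum.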
#include <iostream>
#include <omp.h>

#define NTHREADS 2
#define SIZE 3
#define NLOOPS 4

int main() {
    int* array = new int[SIZE];
    for (int i = 0; i < SIZE; i++) array[i] = i;
    // Initial array 0 1 2

    // Addition: each private copy starts as all zeros
    #pragma omp parallel for reduction(+:array[:SIZE]) num_threads(NTHREADS)
    for (int iiter = 0; iiter < NLOOPS; iiter++) {
        // Print the thread-private copy, one thread at a time
        #pragma omp critical
        for (int j = 0; j < SIZE; j++)
            std::cout << array[j];
    }
    std::cout << std::endl;
    // 000000000000

    // Multiplication: each private copy starts as all ones
    #pragma omp parallel for reduction(*:array[:SIZE]) num_threads(NTHREADS)
    for (int iiter = 0; iiter < NLOOPS; iiter++) {
        #pragma omp critical
        for (int j = 0; j < SIZE; j++)
            std::cout << array[j];
    }
    std::cout << std::endl;
    // 111111111111

    // User-defined reduction: each private element starts as twice the original
    #pragma omp declare reduction(MySum: int: omp_out += omp_in) initializer(omp_priv = 2 * omp_orig)
    #pragma omp parallel for reduction(MySum:array[:SIZE]) num_threads(NTHREADS)
    for (int iiter = 0; iiter < NLOOPS; iiter++) {
        #pragma omp critical
        for (int j = 0; j < SIZE; j++)
            std::cout << array[j];
    }
    std::cout << std::endl;
    // 024024024024

    delete[] array;
    return 0;
}
Tested on GCC-8.3.0 and ICC-