I wrote a small OpenMP test, and it does not always work correctly:
    #include <omp.h>
    int main() {
        int i, j = 0;
        #pragma omp parallel
        for (i = 0; i < 1000; i++)
        {
            #pragma omp barrier
            j += j ^ i;
        }
        return j;
    }
Writing to j from all threads is incorrect in this example, but that should only make the value of j nondeterministic. Instead, the program freezes.
Compiled with: gcc-4.3.1 -fopenmp a.c -o gcc -static
Run on a 4-core x86_Core2 Linux server: $ ./gcc
The program sometimes freezes (roughly 1 freeze in every 4-5 quick runs).
Strace:
[pid 13118] futex(0x80d3014, FUTEX_WAKE, 1) = 1
[pid 13119] <... futex resumed> ) = 0
[pid 13118] futex(0x80d3020, FUTEX_WAIT, 251, NULL <unfinished ...>
[pid 13119] futex(0x80d3014, FUTEX_WAKE, 1) = 0
[pid 13119] futex(0x80d3020, FUTEX_WAIT, 251, NULL
<freeze>
Why do I have a freeze (deadlock)?
Try making i private so that each thread has its own copy.
Now that I have more time, I will try to explain. By default, variables in OpenMP are shared. There are a few cases where the defaults make variables private, but a parallel region is not one of them (so High Performance Mark's response is wrong). Your original program has two race conditions: one on i and one on j. The problem is the one on i. Each thread executes the loop some number of times, but because every thread modifies the shared i, the number of iterations any one thread executes is indeterminate. A barrier is satisfied only when every thread in the team reaches it, so if the threads do not all execute the barrier the same number of times, some thread ends up waiting at a barrier the others never reach, and the program hangs.
Since the OpenMP spec clearly states (OMP spec V3.0, section 2.8.3 barrier Construct) that "the sequence of worksharing regions and barrier regions encountered must be the same for every thread in a team", your program is non-compliant and as such can have indeterminate behavior.
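As a minimal sketch of the fix described above, the loop counter can be made private so that every thread reaches the barrier exactly n times. The function name barrier_loop, the parameter n, and the variables local and result are hypothetical names chosen for illustration; they are not from the original program. The sketch also removes the race on j by giving each thread its own accumulator, and uses unsigned arithmetic so that overflow is well defined.

```c
/* Hypothetical helper, not part of the original post.
 * Every thread runs the loop exactly n times, so all threads
 * encounter the same sequence of barrier regions. */
unsigned barrier_loop(int n)
{
    int i;
    unsigned result = 0;

    /* private(i): each thread gets its own loop counter */
    #pragma omp parallel private(i)
    {
        unsigned local = 0;   /* declared inside the region, so private */

        for (i = 0; i < n; i++) {
            #pragma omp barrier
            local += local ^ (unsigned)i;
        }

        /* Every thread computed the same value; one thread stores it.
         * The single construct ends with an implicit barrier. */
        #pragma omp single
        result = local;
    }
    return result;
}
```

Assuming a compiler with OpenMP support, this can be built with -fopenmp and called from main, for example as barrier_loop(1000). If the intent was only to stop the hang while keeping j shared, changing the original pragma to "#pragma omp parallel private(i)" should be enough, although the race on j would remain.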