carraysmultithreadingopenmpcilk-plus

Parallelize function operating on a row of a 2D array (C, OpenMP, CilkPlus)


I am trying to parallelize the function calls within the for-loop inside main of test_function with OpenMP/CilkPlus (as shown in the C code). For each iteration, read/write operations occur on only one row of the 2d_array so there are no data dependencies within the iterations (2d_array is shared among the available threads, and i is private by default).

void test_function(int *in, int len)
{
    int i, value; 
    int *x, *low, *high;
    x = x_alloc + 4;
    for (i=0; i<len; i++)
        x[i] = in[i];

    for(i=1;i<=4;i++) 
    {
        x[-i] = x[i];
        x[(len-1) + i] = x[(len-1) - i];
    }

    high = malloc(sizeof(int)*((len>>1) + 1));

    for(i=0; i < ((len>>1) + 1); i++)
    {
        value = x[-4 + 2*i] + x[2 + 2*i] + x[-2 + 2*i] + x[2*i];
        high[i] = x[-1 + 2*i] - value;
    }

    low = malloc(sizeof(int)*(len>>1));

    for(i = 0; i < (len>>1); i++) 
    {
        value = high[i] + high[i + 1];
        low[i] = x[2*i] - value;
    }
    for (i=0; i<(len>>1); i++)
        in[i] = low[i];
        in[i+(len>>1)] = high[i+1];

    free(low);
    free(high);
}

int main{...}
...
int **2d_array;

...
#pragma omp parallel for
for(i = 0; i < height; i++){
    test_function(2d_array[i], width);
}

Regardless, the result is wrong. Also tried cilk_for instead of OpenMP pragmas. Is there a specific way to treat 2D arrays when each row is altered during each iteration?


Solution

  • The problems is that the variable x points to the same memory address for each thread. To fix this do something like

    x = malloc(sizeof(int)*(len+8));
    

    inside test_function.

    This is a common source of error using pointers with OpenMP. If you do &x you will see that each thread has a different memory address for x. So the pointers are actually private however when you do x = x_alloc + 4; the memory address each pointer points to is the same. This is effectively like having a shared array but you really want a private array for each thread. Therefore you have to allocate space for thread and set each private pointer to that private space.

    If you want to avoid allocating and deallocating x for each iteration you can try something like this

    #pragma omp parallel
    {
        int *x;
        x = malloc(sizeof(int)*(len+8));
        #pragma omp for
        for(i = 0; i < height; i++){
            test_function(2d_array[i], width, x);
        }
        free(x);
    }
    

    where test_funciton now requires x as an argument.