Tags: c++, parallel-processing, translate, openacc, sycl

How do I translate this simple OpenACC code to SYCL?


I have this code:

#pragma acc kernels
#pragma acc loop seq
for(i=0; i<bands; i++)
{
    mean=0;

    #pragma acc loop seq
    for(j=0; j<N; j++)
        mean+=(image[(i*N)+j]);

    mean/=N;
    meanSpect[i]=mean;

    #pragma acc loop
    for(j=0; j<N; j++)
        image[(i*N)+j]=image[(i*N)+j]-mean;
}

As you can see, the outer loop is marked to run sequentially (single thread), as is the first inner loop, but the second inner loop can be parallelized, so I parallelize it.

My question is: how do I translate this to SYCL? Do I put everything inside one q.submit() and create a parallel_for() inside it only for the parallel region? Would that be possible (and correct)?

Second question, the above code continues as follows:

#pragma acc parallel loop collapse(2)
for(j=0; j<bands; j++)
    for(i=0; i<bands; i++)
        Corr[(i*bands)+j] = Cov[(i*bands)+j]+(meanSpect[i] * meanSpect[j]);

How do I express the collapse() clause in SYCL? Does an equivalent exist, or do I have to write it another way?

Thank you very much in advance.


Solution

  • In case anyone sees this, here's the correct answer:

    First code:

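    for(i=0; i<bands; i++)
    {
      // The mean is a serial reduction, so it stays on the host.
      mean=0;

      for(j=0; j<N; j++)
         mean+=(image[(i*N)+j]);

      mean/=N;
      meanSpect[i]=mean;

      // Only the subtraction is offloaded: one kernel per band, N work-items.
      // This assumes image is a USM allocation that both host and device can access.
      q.submit([&](auto &h) {
        h.parallel_for(range(N), [=](auto j) {
           image[(i*N)+j]=image[(i*N)+j]-mean;
        });
      }).wait();
    }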
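
If it helps while porting, here is a minimal serial sketch of the first loop nest in plain C++ (no SYCL), which can serve as a reference to compare the device results against. The function name `remove_band_means` and the `float` element type are assumptions for illustration; the post does not show the declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): serial reference for the
// first loop nest. For each band it computes the mean, stores it, and
// subtracts it from every pixel of that band in place.
// Assumption: pixels are float and image holds bands * N values.
std::vector<float> remove_band_means(std::vector<float>& image,
                                     std::size_t bands, std::size_t N) {
    assert(image.size() == bands * N);
    std::vector<float> meanSpect(bands);
    for (std::size_t i = 0; i < bands; ++i) {
        // Serial reduction over one band.
        float mean = 0.0f;
        for (std::size_t j = 0; j < N; ++j)
            mean += image[i * N + j];
        mean /= static_cast<float>(N);
        meanSpect[i] = mean;

        // This is the loop the SYCL version turns into a parallel_for.
        for (std::size_t j = 0; j < N; ++j)
            image[i * N + j] -= mean;
    }
    return meanSpect;
}
```

Running this on a copy of the input and comparing element by element (with a small tolerance for floating-point differences) is one way to check the offloaded version.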
    

    Second code:

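    q.submit([&](auto &h) {
       // A two-dimensional range plays the role of collapse(2): one work-item per (j, i) pair.
       // bands_sycl is presumably bands converted to size_t (its definition is not shown here).
       h.parallel_for(range<2>(bands_sycl,bands_sycl), [=](auto index) {
         int i = index[1];   // inner loop index
         int j = index[0];   // outer loop index
         Corr[(i*bands)+j] = Cov[(i*bands)+j]+(meanSpect[i] * meanSpect[j]);
       });
    }).wait();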
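
To see what collapse(2) does to the iteration space, the sketch below flattens both loops into a single index k and splits it back into (j, i). That is the same pairing a two-dimensional SYCL range gives through index[0] and index[1], where the last dimension varies fastest. It is plain C++ rather than SYCL, and the name `correlation_flat` and the `float` element type are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): serial version of the
// second loop nest with the two loops collapsed into one flat index k.
// Assumption: values are float and cov holds bands * bands entries.
std::vector<float> correlation_flat(const std::vector<float>& cov,
                                    const std::vector<float>& meanSpect) {
    const std::size_t bands = meanSpect.size();
    assert(cov.size() == bands * bands);
    std::vector<float> corr(bands * bands);
    for (std::size_t k = 0; k < bands * bands; ++k) {
        const std::size_t j = k / bands;  // outer loop index, like index[0]
        const std::size_t i = k % bands;  // inner loop index, like index[1]
        corr[i * bands + j] = cov[i * bands + j] + meanSpect[i] * meanSpect[j];
    }
    return corr;
}
```

Every (j, i) pair is independent of the others, which is why the nest can be handed to a single parallel_for over a two-dimensional range.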