This is very strange, and I have been struggling with this problem for a whole week. I just want to use a member variable that is set in the struct's constructor inside a kernel, but I can't get it to work. Here is the simplified code:
#include <CL/sycl.hpp>
#include <fstream>
#include <cstdlib>
#include <stdio.h>
#include <stdlib.h>
#define ghost 3
using namespace cl::sycl;
struct test
{
    int ls[3];
    queue Q{};
    test()
    {
        ls[0] = ghost;
        ls[1] = ghost;
        ls[2] = ghost;
    }
    void calculate();
};
void test::calculate()
{
    size_t lx = 10;
    size_t ly = 10;
    auto abc = Q.submit([&](handler &h)
    {
        sycl::stream out(1024, 256, h);
        h.parallel_for(range{lx, ly}, [=, lsq = this->ls](id<2> idx)
        { out << "this is id1" << lsq[1] << "\n"; });
    });
}
int main()
{
    test t1;
    t1.calculate();
    return 0;
}
Someone from the DPC++ community suggested this way of capturing the member through the this pointer, but I don't know why it does not work.
According to section 4.12.4, "Rules for parameter passing to kernels", of the SYCL 2020 Specification, an array of scalar values can be passed as a kernel parameter. The problem here is in how the struct member is captured:
[lsq = this->ls]
is equivalent to
auto lsq = this->ls;
In this case the array decays to a pointer, so the type of lsq is int*, and it holds the address of test::ls in host memory. Accessing the array elements through that pointer in the kernel leads to undefined behavior.
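The decay can be checked on the host without any SYCL device. The following is only a minimal sketch in plain C++17; DecayDemo, captured_pointer and check_decay are hypothetical names made up for illustration and are not part of the original code:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the `test` struct from the question; no SYCL involved.
struct DecayDemo {
    int ls[3] = {3, 3, 3};

    // Returns whatever the init-capture stored in the closure.
    int* captured_pointer() {
        auto f = [lsq = this->ls]() {
            // The init-capture is deduced like `auto lsq = this->ls;`,
            // so the array has already decayed to a pointer here.
            static_assert(std::is_pointer_v<decltype(lsq)>,
                          "lsq is a pointer, not an array");
            return lsq;
        };
        return f();
    }
};

// Checks that the captured pointer aliases the member array itself.
void check_decay() {
    DecayDemo d;
    assert(d.captured_pointer() == d.ls);
}
```

On the host this aliasing is harmless, but a device kernel receiving the same pointer would be dereferencing a host address.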
There are two possible solutions here. The first is to create a local reference to test::ls and capture it by value in the kernel:
void test::calculate() {
    size_t lx = 10;
    size_t ly = 10;
    auto abc = Q.submit([&](handler &h) {
        sycl::stream out(1024, 256, h);
        auto& lsq = this->ls;
        h.parallel_for(range{lx, ly}, [=](id<2> idx) {
            out << "this is id1: " << lsq[1] << "\n";
        });
    });
}
In this case, capturing the reference by copy copies the array it refers to, so the captured variable (lsq) has type int[3] and is correctly initialized in the kernel.
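The copy semantics of this capture can also be observed on the host. This is a sketch in plain C++ under the assumption that a normal host lambda is a fair model of the kernel lambda's capture rules; RefCopyDemo and closure_owns_copy are hypothetical names:

```cpp
#include <cassert>

// Hypothetical stand-in for the `test` struct; no SYCL involved.
struct RefCopyDemo {
    int ls[3] = {3, 3, 3};

    // Returns true if the closure kept its own copy of the array.
    bool closure_owns_copy() {
        auto& lsq = this->ls;               // int (&)[3], no decay
        auto f = [=]() { return lsq[1]; };  // copy capture stores a whole int[3]
        ls[1] = 42;                         // modify the member afterwards
        return f() == 3;                    // the closure still holds the old value
    }
};
```

Because the closure carries the three ints by value, nothing in it points back to the host object.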
Alternatively, use std::array or sycl::marray instead of a C array:
#include <CL/sycl.hpp>
#define ghost 3
using namespace cl::sycl;
struct test {
    marray<int, 3> ls;
    queue Q;
    test() {
        ls[0] = ls[1] = ls[2] = ghost;
    }
    void calculate() {
        size_t lx = 10;
        size_t ly = 10;
        auto abc = Q.submit([&](handler& h) {
            sycl::stream out(1024, 256, h);
            // marray is a class type, so the init-capture copies it by value
            h.parallel_for(range{ lx, ly }, [=, lsq = this->ls](id<2> idx) {
                out << "this is id1: " << lsq[1] << "\n";
            });
        });
    }
};
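The same init-capture works with a class-type array because class types do not decay to pointers. A host-only sketch with std::array, assuming plain C++ is enough to illustrate the point (StdArrayDemo and closure_owns_copy are hypothetical names):

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the `test` struct, using std::array; no SYCL involved.
struct StdArrayDemo {
    std::array<int, 3> ls{3, 3, 3};

    // Returns true if the init-capture copied the whole array by value.
    bool closure_owns_copy() {
        auto f = [lsq = this->ls]() { return lsq[1]; };  // lsq is std::array<int, 3>
        ls[1] = 42;                                      // modify the member afterwards
        return f() == 3;                                 // the closure kept its own copy
    }
};
```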