cclsyclintel-oneapidpc++intel-ccl

buffers in CCL code samples along with the oneapi toolkit


I Was going through the CCL code samples along with the oneapi toolkit. In the below DPC++(SYCL) code initially sendbuf a buffer is created in the cpu side and is not initialised and in the part where offloading to target device takes place the dev_acc_sbuf[id] variable, which is a variable in the kernel scope is modified. This variable(dev_acc_sbuf) is not hence used in the program neither is its value copied back to sendbuf.Then in the next line the sendbuf variable is used for allreduce. I am not able to understand how changing the dev_acc_sbuf makes change in the sendbuf.

          cl::sycl::queue q;
cl::sycl::buffer<int, 1> sendbuf(COUNT);
          /* open sendbuf and modify it on the target device side */
q.submit([&](cl::sycl::handler& cgh) {
   auto dev_acc_sbuf = sendbuf.get_access<mode::write>(cgh);
   cgh.parallel_for<class allreduce_test_sbuf_modify>(range<1>{COUNT}, [=](item<1> id) {
       dev_acc_sbuf[id] += 1;
   });
});
/* invoke ccl_allreduce on the CPU side */
ccl_allreduce(&sendbuf,
              &recvbuf,
              COUNT,
              ccl_dtype_int,
              ccl_reduction_sum,
              NULL,
              NULL,
              stream,
              &request);

Solution

  • In the line "auto dev_acc_sbuf = sendbuf.get_access<mode::write>(cgh);" the dev_acc_sbuf is a handle that accesses sendbuf and not a seperate buffer. The changes made in the dev_acc_sbuf handle gets reflected to the original buffer ie the sendbuffer . This is an advantage in SYCL as the changes made in the kernel scope is automatically copied back to the original variable