amd-rocm hip

HIP Device Synchronization


I'm copying data to GPU memory using hipMemcpyAsync and then waiting for the device to complete the transfer with hipDeviceSynchronize. The description of the hipDeviceSynchronize function here states that "the host thread gets blocked until all the commands associated with streams associated with the device." Judging by the CPU utilization, I believe the host thread busy-waits inside this function until the device completes its tasks. Is there another function I can call, or a variable I can set, to switch this to interrupt-based blocking, where the host thread is actually blocked by the OS and later woken up by an interrupt from the device?


Solution

  • I figured it out myself. All that was needed was to add the following call.

    hipSetDeviceFlags(hipDeviceScheduleBlockingSync);
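For context, here is a minimal sketch of where that call might sit in a complete program. It is an illustration rather than the original poster's code: the buffer size, the variable names, and the HIP_CHECK helper macro are all made up for this example. It also assumes that the flag should be set before any other HIP call that initializes the device, as is the usual advice for the CUDA equivalent, and that pinned host memory is used so the copy can actually run asynchronously.

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: abort with a message if a HIP call fails.
#define HIP_CHECK(call)                                            \
    do {                                                           \
        hipError_t err_ = (call);                                  \
        if (err_ != hipSuccess) {                                  \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         hipGetErrorString(err_));                 \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

int main() {
    // Ask the runtime to block the host thread in the OS while waiting,
    // instead of spinning. Assumption: this is done first, before any
    // call that initializes the device.
    HIP_CHECK(hipSetDeviceFlags(hipDeviceScheduleBlockingSync));

    const size_t count = 1 << 24;               // illustrative size
    const size_t bytes = count * sizeof(float);

    float* host = nullptr;
    float* device = nullptr;
    hipStream_t stream;

    // Pinned host memory, so hipMemcpyAsync does not fall back to a
    // staged, effectively synchronous copy.
    HIP_CHECK(hipHostMalloc(reinterpret_cast<void**>(&host), bytes,
                            hipHostMallocDefault));
    HIP_CHECK(hipMalloc(reinterpret_cast<void**>(&device), bytes));
    HIP_CHECK(hipStreamCreate(&stream));

    for (size_t i = 0; i < count; ++i) host[i] = 1.0f;

    // Enqueue the transfer; this returns without waiting for completion.
    HIP_CHECK(hipMemcpyAsync(device, host, bytes,
                             hipMemcpyHostToDevice, stream));

    // Wait for all outstanding work on the device. With the flag above,
    // the host thread is expected to sleep here rather than busy-wait.
    HIP_CHECK(hipDeviceSynchronize());

    HIP_CHECK(hipStreamDestroy(stream));
    HIP_CHECK(hipFree(device));
    HIP_CHECK(hipHostFree(host));
    return 0;
}
```

Built with hipcc, a program like this can be run with and without the hipSetDeviceFlags line while watching CPU utilization (for example in top) to see whether the wait changes from spinning to sleeping. The runtime also defines hipDeviceScheduleSpin, hipDeviceScheduleYield, and hipDeviceScheduleAuto for the other scheduling policies.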