I'm trying to call a CUDA library function (cublasDgemm) and I'm getting an error because it accesses addresses that should be inaccessible.
I think this happens because the CUBLAS function is being given the host variables rather than the device ones.
I've seen that in OpenACC, you would use this:
#pragma acc host_data use_device(list of variables)
{
    (call to CUBLAS function)
}
host_data makes the device addresses of variables available on the host, and use_device makes the code inside the braces {} use the device copies of the listed variables rather than the host ones. The details are in the OpenACC 2.0 specification: https://www.openacc.org/sites/default/files/inline-files/OpenACC_2_0_specification.pdf
So, is there a way to replicate this in OpenMP, and is it actually necessary? How do I make sure that the CUBLAS call uses the device copies of the variables?
Try:
#pragma omp target data use_device_ptr(list of variables)
{
call to cuda(vars)
}
See slide 27 of: https://on-demand.gputechconf.com/gtc/2018/presentation/s8344-openmp-on-gpus-first-experiences-and-best-practices.pdf
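
To make the suggestion above concrete, here is a minimal sketch, not a definitive implementation. It assumes an OpenMP 4.5+ compiler with NVIDIA offloading, square column-major matrices, and that the arrays are mapped by an enclosing target data region, since use_device_ptr can only translate pointers that already have a device copy. The function name square_dgemm and the USE_CUBLAS macro are hypothetical names chosen for this example; without the macro the function falls back to a plain host loop, so the result can be checked on a machine without a GPU. The cudaDeviceSynchronize() call is a conservative assumption, in case the OpenMP runtime copies C back on a different stream than the one cuBLAS uses.

```c
#include <stddef.h>
#ifdef USE_CUBLAS              /* hypothetical macro, chosen for this sketch */
#include <cublas_v2.h>
#include <cuda_runtime.h>
#endif

/* Computes C = A * B for n x n column-major matrices.
 * Returns 0 on success, nonzero on failure. */
int square_dgemm(int n, const double *A, const double *B, double *C)
{
#ifdef USE_CUBLAS
    cublasHandle_t handle;
    cublasStatus_t st = CUBLAS_STATUS_SUCCESS;
    const double alpha = 1.0, beta = 0.0;

    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS)
        return 1;

    /* Outer region: create the device copies and move the data. */
    #pragma omp target data map(to: A[0:n*n], B[0:n*n]) map(from: C[0:n*n])
    {
        /* Inner region: inside the braces A, B and C hold device addresses,
         * which is what cuBLAS expects for its matrix arguments. */
        #pragma omp target data use_device_ptr(A, B, C)
        {
            st = cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                             n, n, n, &alpha, A, n, B, n, &beta, C, n);
            /* Assumption: wait for cuBLAS before OpenMP copies C back. */
            cudaDeviceSynchronize();
        }
    }

    cublasDestroy(handle);
    return st != CUBLAS_STATUS_SUCCESS;
#else
    /* Host reference with the same column-major layout, no GPU needed. */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            double sum = 0.0;
            for (int k = 0; k < n; ++k)
                sum += A[i + (size_t)k * n] * B[k + (size_t)j * n];
            C[i + (size_t)j * n] = sum;
        }
    }
    return 0;
#endif
}
```

Note that alpha and beta stay on the host and are not listed in use_device_ptr: with the default cuBLAS pointer mode the scalars are read from host memory, and only the matrices need device addresses. How you build it depends on your toolchain; with the NVIDIA HPC SDK something like nvc -mp=gpu -cudalib=cublas -DUSE_CUBLAS should work, but check your compiler's documentation. As a quick check, calling square_dgemm(2, A, B, C) with A = {1, 2, 3, 4} and B = {5, 6, 7, 8} (both column-major) should leave C = {23, 34, 31, 46}.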