I use the OpenCL.NET C# wrapper for OpenCL.
According to GPU-Z, my GPU is an AMD Radeon Barcelo, and it reports OpenCL 2 support.
Part of the code:
// probably useless
#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable
#pragma OPENCL EXTENSION cl_khr_global_int32_extended_atomics : enable
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
void vector_is_zero_partial(
    uint row,
    uint row_to,
    __global const double *x,
    double tolerance,
    __global atomic_int *is_zero)
{
    for (; row < row_to; ++row)
    {
        if (fabs(x[row]) > tolerance)
        {
            atomic_store(is_zero, 0);
            break;
        }
        if (!atomic_load(is_zero)) break;
    }
}
The error:
C:\Users\CHAMEL~1\AppData\Local\Temp\\OCL8036T0.cl:264:4: error: implicit declaration of function 'atomic_store' is invalid in C99
atomic_store(is_zero, 0);
^
C:\Users\CHAMEL~1\AppData\Local\Temp\\OCL8036T0.cl:267:8: error: implicit declaration of function 'atomic_load' is invalid in C99
if (!atomic_load(is_zero)) break;
^
2 errors generated.
error: Clang front-end compilation failed!
Frontend phase failed compilation.
Error: Compiling CL to IR
So the atomic extensions exist and OpenCL is version 2, but atomic_store and atomic_load do not exist.
Did I do something wrong here?
atomic_load (like atomic_store, when called without explicit memory order and scope arguments) requires the device features __opencl_c_atomic_order_seq_cst and __opencl_c_atomic_scope_device. As you are on the AMD-APP platform, it is possible these are not available. You could check clinfo to be sure.
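Besides clinfo, the kernel source itself can probe for these features with the preprocessor, since the compiler predefines each feature macro only when the feature is supported. The block below is a minimal sketch, not a definitive test: `HAVE_C11_STYLE_ATOMICS` is a name made up for this example, and the version branch assumes the usual rule that OpenCL C 2.0 mandates these atomics while OpenCL C 3.0 makes them optional.

```c
/* Probe for the plain atomic_load / atomic_store overloads at compile time.
   HAVE_C11_STYLE_ATOMICS is a hypothetical name chosen for this sketch. */
#if defined(__opencl_c_atomic_order_seq_cst) && \
    defined(__opencl_c_atomic_scope_device)
/* OpenCL C 3.0: both optional features are reported as present. */
#define HAVE_C11_STYLE_ATOMICS 1
#elif defined(__OPENCL_C_VERSION__) && \
      (__OPENCL_C_VERSION__ >= 200) && (__OPENCL_C_VERSION__ < 300)
/* OpenCL C 2.0: these atomics are a mandatory part of the language. */
#define HAVE_C11_STYLE_ATOMICS 1
#else
/* OpenCL C 1.x, OpenCL C 3.0 without the features, or a non-OpenCL compiler. */
#define HAVE_C11_STYLE_ATOMICS 0
#endif
```

Kernel code can then use `#if HAVE_C11_STYLE_ATOMICS` to choose between atomic_store / atomic_load and a fallback. If the macro evaluates to 0 on a device that reports OpenCL 2.0, the build options may be worth checking as well: under the OpenCL 2.x specification, a program built without `-cl-std=CL2.0` is compiled as the highest OpenCL C 1.x version the device supports.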
Two functions that could be considered instead are atomic_xchg (in place of atomic_store) and atomic_max (in place of atomic_load):
void vector_is_zero_partial(
    uint row,
    uint row_to,
    __global const double *x,
    double tolerance,
    volatile __global int *is_zero) // atomic_xchg / atomic_max take a plain int pointer, not atomic_int *
{
    for (; row < row_to; ++row)
    {
        if (fabs(x[row]) > tolerance)
        {
            atomic_xchg(is_zero, 0); // stores 0 to *is_zero and returns the previous value
            break;
        }
        if (!atomic_max(is_zero, 0)) break;
        /* atomic_max returns the previous value of *is_zero and stores
           max(*is_zero, 0) back. If *is_zero is 1 it remains 1, and if it
           has been set to 0 it remains 0, so this acts as an atomic read. */
    }
}
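To check the reasoning in the comment without a GPU, the flag logic can be modelled on the host. The sketch below is plain C11 using `<stdatomic.h>`, not OpenCL C: `model_atomic_xchg`, `model_atomic_max`, and `flag_model_ok` are hypothetical helpers written for this illustration, and they only mimic the documented behaviour of the built-ins (store the new value or the maximum, return the old value).

```c
#include <stdatomic.h>

/* Host-side C11 model of two legacy OpenCL built-ins. The model_* names are
   hypothetical; real kernels call atomic_xchg / atomic_max directly. */

/* Like atomic_xchg: store val, return the value that was there before. */
static int model_atomic_xchg(atomic_int *p, int val)
{
    return atomic_exchange(p, val);
}

/* Like atomic_max: store max(old, val), return old. */
static int model_atomic_max(atomic_int *p, int val)
{
    int old = atomic_load(p);
    /* Retry until the stored value is already >= val or our swap succeeds. */
    while (old < val && !atomic_compare_exchange_weak(p, &old, val)) {
        /* a failed compare-exchange has refreshed old with the current value */
    }
    return old;
}

/* Walks the flag through the states the kernel can observe. Returns 1 if
   every step behaves the way the kernel comment describes, 0 otherwise. */
static int flag_model_ok(void)
{
    atomic_int is_zero;
    atomic_init(&is_zero, 1);

    int ok = 1;
    ok &= (model_atomic_max(&is_zero, 0) == 1);  /* reads 1, flag stays 1 */
    ok &= (atomic_load(&is_zero) == 1);
    ok &= (model_atomic_xchg(&is_zero, 0) == 1); /* clears flag, returns old 1 */
    ok &= (model_atomic_max(&is_zero, 0) == 0);  /* reads 0, flag stays 0 */
    ok &= (atomic_load(&is_zero) == 0);
    return ok;
}
```

The model shows that, as long as the flag only ever holds 0 or 1, taking the maximum with 0 never changes it, which is why atomic_max can stand in for a read here.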