In OpenCL, are there any performance benefits to flagging buffers as READ_ONLY or WRITE_ONLY?
This kernel is what I often see (a is READ_ONLY and b is WRITE_ONLY):
__kernel void two_buffer_double(__global float* a, __global float* b)
{
    int i = get_global_id(0);
    b[i] = a[i] * 2;
}
This kernel seems better, because it uses less global memory (a is READ_WRITE):
__kernel void one_buffer_double(__global float* a)
{
    int i = get_global_id(0);
    a[i] = a[i] * 2;
}
Do READ_ONLY and WRITE_ONLY flags just exist to help with debugging and catching errors?
To answer your question directly: no, these flags do not exist just to help with debugging and catching errors. However, it's hard to give any reference on how a given implementation uses them or how they affect performance.
My understanding (unfortunately not backed up by any documentation) is that these flags put more constraints on how the buffers will be used, which lets the runtime/driver/compiler make assumptions that might improve performance. For instance, I imagine there should be no worries about memory consistency for a read-only buffer while a kernel is using it, since the work-items are not supposed to write to it. Some checks could therefore be skipped, though in OpenCL you are supposed to take care of this yourself using barriers and so on.
Note also that OpenCL 1.2 introduced some further flags, this time describing how the host needs to access the buffers:
CL_MEM_HOST_NO_ACCESS,
CL_MEM_HOST_{READ, WRITE}_ONLY
The related CL_MEM_{USE, ALLOC, COPY}_HOST_PTR flags, which control how a buffer is backed by or initialized from host memory, have been available since OpenCL 1.0.
I'm guessing that, again, these help the people implementing OpenCL to improve performance, but I guess we'd need input from some AMD or NVIDIA experts.
Please note that everything I have said so far is only my own understanding and is not based on any serious documentation (I didn't manage to find any).
On the other hand, I can tell you for sure that the standard does not force a read-only buffer to be placed in constant memory, contrary to what @Quonux stated. Some implementations might do this for small buffers. Let's not forget that constant memory is small, so a read-only buffer can be too large to fit in it. The only way to make sure that a buffer is in constant memory is to use the constant keyword in your kernel code, as explained here. Of course, on the host side, if you want to use a constant buffer you have to create it with the read-only flag.
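To make that last point concrete, here is a minimal sketch of the question's two-buffer kernel with the input moved to the constant address space. The kernel name constant_double is hypothetical, and the #ifndef block is purely my own addition so the function can also be compiled as plain C for a quick host-side check; a real OpenCL compiler defines __OPENCL_VERSION__ and skips it.

```c
/* Host-only shim (an assumption for illustration, not part of OpenCL):
   lets this file build as plain C. An OpenCL compiler defines
   __OPENCL_VERSION__ and never sees this block. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
#define __constant const
static size_t shim_global_id = 0;  /* stands in for the work-item index */
static size_t get_global_id(unsigned int dim) { (void)dim; return shim_global_id; }
#endif

/* a is declared in the constant address space, so work-items can only
   read it; b is an ordinary global buffer that receives the result. */
__kernel void constant_double(__constant float* a, __global float* b)
{
    size_t i = get_global_id(0);
    b[i] = a[i] * 2.0f;
}
```

On the host, the buffer bound to a would be created with CL_MEM_READ_ONLY, and its size has to stay within what the device reports for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE.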