I'm trying to define a global array of structs inside an OpenCL kernel.
I get some data from the host and try to assign it to that structure, but when I printf the resulting structure, the values appear many times and out of order, as if the work-items were racing.
This is the struct I define:
typedef struct tag_triangle {
float3 p0;
float3 p1;
float3 p2;
} __attribute__((aligned(128))) UserStruct;
struct tag_triangle primitives[12];
Here I receive an array of floats called vbo and assign its values to the first triangle primitive:
__kernel void VertexShaderKernel(__global float* vbo, int vbosize
)
{
size_t threadIdX = get_local_id(0);
size_t threadIdY = get_local_id(1);
size_t blockIdx = get_group_id(0);
size_t blockIdy = get_group_id(1);
size_t blockDimX = get_local_size(0);
size_t blockDimY = get_local_size(1);
int i = (blockIdx * blockDimX) + threadIdX; // work-item index along dimension 0
primitives[0].p0 = (float3)(vbo[0], vbo[1], vbo[2]);
primitives[0].p1 = (float3)(vbo[3], vbo[4], vbo[5]);
primitives[0].p2 = (float3)(vbo[6], vbo[7], vbo[8]);
printf(" %f, %f , %f \n\r ", primitives[0].p0.x, primitives[0].p0.y, primitives[0].p0.z);
printf(" %f, %f , %f \n\r ", primitives[0].p1.x, primitives[0].p1.y, primitives[0].p1.z);
printf(" %f, %f , %f\n\r ", primitives[0].p2.x, primitives[0].p2.y, primitives[0].p2.z);
// racing
}
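This is expected behaviour. The kernel is executed in parallel by every work-item, and each work-item makes its own 3 printf calls. Their output therefore reaches the console interleaved, in no guaranteed order, which looks like a race even though every work-item writes the same values. If you want only a single set of printf output, use a condition on the work-item index i to keep all but one work-item quiet: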
if(i==0) {
printf(...);
printf(...);
printf(...);
}
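As a fuller illustration of the guard, here is a minimal sketch of the kernel from the question with the printing restricted to one work-item. It rests on a few assumptions that are not in the original post: it uses get_global_id() in both dimensions instead of the hand-computed i, because in a 2D launch a check on dimension 0 alone would still let one work-item per row print. It also moves the writes inside the guard so that only one work-item fills primitives[0], assumes vbo holds at least 9 floats, and assumes the compiler accepts a writable program-scope array (OpenCL C 2.0 or later).

```c
typedef struct tag_triangle {
    float3 p0;
    float3 p1;
    float3 p2;
} __attribute__((aligned(128))) UserStruct;

// Assumption: writable program-scope storage, which needs OpenCL C 2.0+.
// In OpenCL C 1.x, program-scope variables must be __constant.
__global UserStruct primitives[12];

__kernel void VertexShaderKernel(__global const float* vbo, int vbosize)
{
    // Global ids are unique across the whole NDRange, so exactly one
    // work-item has (0, 0) in a 1D or 2D launch.
    size_t gx = get_global_id(0);
    size_t gy = get_global_id(1);

    if (gx == 0 && gy == 0) {
        // Assumption: vbo contains at least 9 floats (3 vertices * xyz).
        primitives[0].p0 = (float3)(vbo[0], vbo[1], vbo[2]);
        primitives[0].p1 = (float3)(vbo[3], vbo[4], vbo[5]);
        primitives[0].p2 = (float3)(vbo[6], vbo[7], vbo[8]);

        // Only this work-item prints, so the three lines appear once.
        printf("%f, %f, %f\n", primitives[0].p0.x, primitives[0].p0.y, primitives[0].p0.z);
        printf("%f, %f, %f\n", primitives[0].p1.x, primitives[0].p1.y, primitives[0].p1.z);
        printf("%f, %f, %f\n", primitives[0].p2.x, primitives[0].p2.y, primitives[0].p2.z);
    }
}
```

With this guard the three lines should be printed once per kernel launch, regardless of the work-group size. If other work-items later need to read primitives[0] within the same launch, the single-writer approach would need additional synchronization, which this sketch does not attempt.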