I have kernel which evaluate interaction between all pairs of neighbors of an atoms. Each atom has max. 4 neighbors so I store their indexes in int4
. But in order to loop over these neighbors I need to access them by index (neighs[0]
rather than neighs.x
).
The loop should look something like:
int iatom = get_global_id(0);
int4 ng = neighs[iatom]; // each atoms has 4 neighbors
float4 p0 = atom_pos[iatom];
float4 force = (float)(0.f,0.f,0.f,0.f);
for(int i=0; i<4; i++){
int ing = ng[i]; // HERE: index into vector
float4 pi = atom_pos[ing];
for(int j=i+1; j<4; j++){
int jng = ng[j]; // HERE: index into vector
float4 pj = atom_pos[jng];
force += evalInteraction( p0, pi, pj );
}
}
forces[iatom]=force;
I have some idea how it can be probably done but not sure:
int4
to int*
int4 ng_ = neighs[iatom]; // make sure we copy it to local memory or register
int* ng = (int*)&ng_; // pointer to local memory can be optimized out, right ?
for(int i=0; i<4; i++){
int ing = ng[i];
...
}
You can cast directly, but you can also declare a union for easier access:
union
{
int components[4];
int4 vector;
} neighbors;
neighbors.vector = ng;
neighbors.components[i]; // Works now