c++ kernel opencl opencl-c

Does OpenCL support vectors as kernel arguments?


I've been thinking of ways to rewrite this code as an OpenCL kernel. The conversion itself won't be particularly hard (getting rid of the glm types and the bit masking), but I'm stuck on how to pass _triangles, _uvs, _indices, and _normals to the kernel. Does OpenCL have any built-in equivalent of std::vector?

If there is no such support, the only option I see is to pass 4 arrays of type float3 for each of the 3 containers I need returned (_triangles, _uvs, and _normals) and 2 arrays of float3 for _indices, then convert the arrays back to vectors on the CPU and shrink them to fit. I'm not sure that passing so many memory buffers is efficient, because that means 14 arrays passed to and returned from the kernel. The other solutions I have won't work when parallelized. Is there a way to simplify this approach, or better yet, a fundamentally better solution?

The function I have trouble with is _addRectangle; _createMesh is the function it will be merged into inside the kernel.

void Chunk::_addRectangle(glm::vec3 center, glm::vec3 height, glm::vec3 width, unsigned tex_num, cl_uint LOD)
{
    glm::vec3 corner1 = center - (height / 2.0) - (width / 2.0);
    glm::vec3 corner2 = center - (height / 2.0) + (width / 2.0);
    glm::vec3 corner3 = center + (height / 2.0) + (width / 2.0);
    glm::vec3 corner4 = center + (height / 2.0) - (width / 2.0);

    glm::vec3 normal = glm::cross(height, width);

    glm::vec2 uv1;
    glm::vec2 uv2;
    glm::vec2 uv3;
    glm::vec2 uv4;

    if (fabs(normal[1]) == 1.0)
    {
        uv1 = glm::vec2(1.0 / _tex_atlas_width, 1);
        uv2 = glm::vec2(1.0 / _tex_atlas_width, 0);
        uv3 = glm::vec2(0, 0);
        uv4 = glm::vec2(0, 1);
    }
    else
    {
        uv1 = glm::vec2(1.0 / _tex_atlas_width, 1);
        uv2 = glm::vec2(1.0 / _tex_atlas_width, 0);
        uv3 = glm::vec2(0, 0);
        uv4 = glm::vec2(0, 1);
    }

    float add = (1.0 / double(_tex_atlas_width)) * tex_num;
    uv1.x += add;
    uv2.x += add;
    uv3.x += add;
    uv4.x += add;

    // triangle 1
    _triangles.push_back(corner3);
    _triangles.push_back(corner2);
    _triangles.push_back(corner1);

    _normals.push_back(normal);
    _normals.push_back(normal);
    _normals.push_back(normal);

    _uvs.push_back(uv1);
    _uvs.push_back(uv2);
    _uvs.push_back(uv3);

    _indices.push_back(glm::ivec3(nrOfIndices + 0, nrOfIndices + 1, nrOfIndices + 2));

    // triangle 2 

    _triangles.push_back(corner4);
    _normals.push_back(normal);
    _uvs.push_back(uv4);


    _indices.push_back(glm::ivec3(nrOfIndices + 2, nrOfIndices + 3, nrOfIndices + 0));
    nrOfIndices += 4;

}

void Chunk::_createMesh(glm::ivec3 pos, int landmap_flags[96 * 96 * 96], cl_int LOD)
{
    std::byte* faces = new std::byte[chunkSize / LOD * chunkSize / LOD * chunkSize / LOD];

    int index = 0;

    // Index conversion from 3D coordinates (x, y, z) to the flat array:
    // landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))]

    for (int x = LOD; x < chunkSize + LOD; x += LOD) {
        for (int y = LOD; y < chunkSize + LOD; y += LOD) {
            for (int z = LOD; z < chunkSize + LOD; z += LOD) {
                // flat index of the current voxel: x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))
                faces[index] = (std::byte)0;
                if (landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                {
                    index++;
                    continue;
                }
                if (landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] != BLOCK::AIR)
                {
                    if (landmap_flags[(x - LOD) + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::South;
                    if (landmap_flags[(x + LOD) + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::North;
                    if (landmap_flags[x + (y - LOD) * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::Down;
                    if (landmap_flags[x + (y + LOD) * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::Up;
                    if (landmap_flags[x + y * (chunkSize + (2 * LOD)) + (z - LOD) * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::West;
                    if (landmap_flags[x + y * (chunkSize + (2 * LOD)) + (z + LOD) * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))] == BLOCK::AIR)
                        faces[index] |= (std::byte)Direction::East;
                }

                if (faces[index] == (std::byte)0)
                    continue;

                if ((faces[index] & (std::byte)Direction::North) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2) + (float(LOD) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2)),
                        glm::vec3(0, LOD, 0),
                        glm::vec3(0, 0, -LOD),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::East) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2) + (float(LOD) / 2)),
                        glm::vec3(0, LOD, 0),
                        glm::vec3(LOD, 0, 0),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::South) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2) - (float(LOD) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2)),
                        glm::vec3(0, LOD, 0),
                        glm::vec3(0, 0, LOD),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::West) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2) - (float(LOD) / 2)),
                        glm::vec3(0, LOD, 0),
                        glm::vec3(-LOD, 0, 0),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::Up) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2) + (float(LOD) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2)),
                        glm::vec3(LOD, 0, 0),
                        glm::vec3(0, 0, LOD),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                if ((faces[index] & (std::byte)Direction::Down) != (std::byte)0)
                {
                    _addRectangle(
                        glm::vec3( x + ((chunkSize - 1) * pos.x) - (chunkSize / 2) - (float(LOD - 1) / 2),
                                   y + ((chunkSize - 1) * pos.y) - (chunkSize / 2) - (float(LOD - 1) / 2) - (float(LOD) / 2),
                                   z + ((chunkSize - 1) * pos.z) - (chunkSize / 2) - (float(LOD - 1) / 2)),
                        glm::vec3(LOD, 0, 0),
                        glm::vec3(0, 0, -LOD),
                        landmap_flags[x + y * (chunkSize + (2 * LOD)) + z * (chunkSize + (2 * LOD)) * (chunkSize + (2 * LOD))],
                        LOD);
                }
                index++;
            }
        }
    }

    delete[]faces;
}
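The flat-index expression repeated throughout _createMesh could be factored into a small helper, which would also carry over to a kernel almost unchanged. A minimal sketch, assuming the padded edge length is chunkSize + (2 * LOD) as in the code above; flatIndex and its stride parameter are hypothetical names, not part of the original code:

```cpp
#include <cstddef>

// Hypothetical helper: maps voxel coordinates (x, y, z) to the flat
// landmap_flags index. `stride` is the padded edge length of the volume,
// i.e. chunkSize + (2 * LOD) in the code above. All arguments are assumed
// to be non-negative.
inline std::size_t flatIndex(int x, int y, int z, int stride)
{
    const std::size_t s = static_cast<std::size_t>(stride);
    return static_cast<std::size_t>(x)
         + static_cast<std::size_t>(y) * s
         + static_cast<std::size_t>(z) * s * s;
}
```

With this, a neighbour test such as the South check would read landmap_flags[flatIndex(x - LOD, y, z, stride)] == BLOCK::AIR.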

Thank you!

Edit: A possibly more efficient way to store the data would be in several float4 values, e.g.:

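const uint n = get_global_id(0);

// OpenCL C builds vector values with a cast-style literal, not C++ braces
float4 triangles = (float4)(1, 2, 3, 4); // calculated values for each vertex

// float4 list[size]; (set up beforehand, e.g. in the constructor)
list[n] = triangles;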
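To make the list[n] idea concrete, here is a minimal CPU-side sketch in C++ that emulates what each work-item would do. It assumes every voxel owns a fixed-size slot in the output (at most 6 faces of 4 vertices each) plus a per-voxel count, so no work-item ever needs push_back. Float4, emitVoxel, compact, and the constants are hypothetical names for illustration, and the vertex values are placeholders rather than real corner positions:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for OpenCL's float4 (16 bytes); on a real host this would be cl_float4.
struct Float4 { float x, y, z, w; };

constexpr std::size_t kMaxFaces = 6;      // a voxel exposes at most 6 faces
constexpr std::size_t kVertsPerFace = 4;  // _addRectangle emits 4 vertices per face
constexpr std::size_t kSlot = kMaxFaces * kVertsPerFace;

// What one work-item would do: write its vertices into its own slot and
// record how many it produced. `n` plays the role of get_global_id(0).
void emitVoxel(std::size_t n, std::uint8_t faceMask,
               std::vector<Float4>& vertices,
               std::vector<std::uint32_t>& counts)
{
    std::uint32_t written = 0;
    for (std::size_t face = 0; face < kMaxFaces; ++face) {
        if ((faceMask & (1u << face)) == 0)
            continue;  // this face is hidden, nothing to emit
        for (std::size_t v = 0; v < kVertsPerFace; ++v) {
            // Placeholder values; a real kernel would compute the corners here.
            vertices[n * kSlot + written] =
                Float4{float(n), float(face), float(v), 1.0f};
            ++written;
        }
    }
    counts[n] = written;
}

// Host-side "shrink to fit": keep only the vertices each work-item wrote.
std::vector<Float4> compact(const std::vector<Float4>& vertices,
                            const std::vector<std::uint32_t>& counts)
{
    std::vector<Float4> out;
    for (std::size_t n = 0; n < counts.size(); ++n) {
        const auto first =
            vertices.begin() + static_cast<std::ptrdiff_t>(n * kSlot);
        out.insert(out.end(), first,
                   first + static_cast<std::ptrdiff_t>(counts[n]));
    }
    return out;
}
```

The trade-off of this layout is memory: the output buffer is sized for the worst case (every voxel showing all six faces), in exchange for work-items that never contend for a shared write position.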

Solution

  • OpenCL has built-in vector types such as float4; they are described in the OpenCL C specification's section on built-in vector data types. It has no containers like std::vector in C++, so the data must be passed as C-style arrays (buffers).

    Judging by the code in the question, _triangles, _uvs, _indices, and _normals are filled with results, so appropriately sized buffers need to be allocated and passed to the kernel to hold those results, then read back once the kernel has finished.

    Passing 14 arrays shouldn't be an issue as long as the kernel is computationally intensive enough, and judging by the code it may well be, since _createMesh runs three nested loops. A lot depends on how large chunkSize and LOD are, though. You would need to try it out and measure how it performs.

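    Copying the data back into a std::vector shouldn't be an issue either: when the element layouts match, just use memcpy. Keep in mind that float3 occupies 16 bytes in OpenCL, so it does not map one-to-one onto a tightly packed 12-byte type.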
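The copy-back step can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: Float4 and Vec3 are hypothetical stand-ins for cl_float4 and a tightly packed 12-byte vector (glm::vec3 is typically laid out this way), and raw stands for host memory that has already been filled by reading the result buffer back from the device.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Stand-ins (assumptions): Float4 mirrors the 16-byte layout of cl_float4,
// Vec3 mirrors a tightly packed 12-byte vector type.
struct Float4 { float x, y, z, w; };
struct Vec3   { float x, y, z; };

// Same layout on both sides: a single memcpy is enough.
std::vector<Float4> copyBack(const Float4* raw, std::size_t count)
{
    static_assert(std::is_trivially_copyable<Float4>::value,
                  "memcpy requires a trivially copyable type");
    std::vector<Float4> out(count);
    if (count != 0)
        std::memcpy(out.data(), raw, count * sizeof(Float4));
    return out;
}

// Different layout (16-byte source, 12-byte destination): convert element
// by element so the fourth (padding) component is dropped.
std::vector<Vec3> copyBackVec3(const Float4* raw, std::size_t count)
{
    std::vector<Vec3> out(count);
    for (std::size_t i = 0; i < count; ++i)
        out[i] = Vec3{raw[i].x, raw[i].y, raw[i].z};
    return out;
}
```

If the kernel also reports how many elements it actually wrote, passing that number as count gives the "shrink to fit" behaviour from the question without a separate resize.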