opengl compute-shader opengl-4

Problem with imageStore in compute shader


I have a problem with a very simple compute shader that just copies a texture using imageStore.

#define KS 16 // kernel size
layout (local_size_x = KS, local_size_y = KS) in;

layout(location = 0) uniform sampler2D u_inputTex;
layout(location = 1) uniform writeonly image2D u_outImg;

void main()
{
    const ivec2 gid = ivec2(gl_WorkGroupID.xy);
    const ivec2 tid = ivec2(gl_LocalInvocationID.xy);
    const ivec2 pixelPos = ivec2(KS) * gid + tid;

    imageStore(u_outImg, pixelPos,
        uvec4(255.0 * texelFetch(u_inputTex, pixelPos, 0).rgb, 255u));
}

On the C++ side, I have this:

int w, h;
u32 inTex = -1;
{
    int nc;
    auto img = stbi_load("imgs/Windmill_NOAA.png", &w, &h, &nc, 3);
        
    if (img) {
        glGenTextures(1, &inTex);
        glBindTexture(GL_TEXTURE_2D, inTex);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, w, h, 0, GL_RGB, GL_UNSIGNED_BYTE, img);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        stbi_image_free(img);
    }
    else
        printf("Error loading img\n");
}

u32 outTex;
{
    glGenTextures(1, &outTex);
    glBindTexture(GL_TEXTURE_2D, outTex);
    glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8UI, w, h);
}

u32 compProg = easyCreateComputeShaderProg("compute", shader_srcs::computeSrc);
glUseProgram(compProg);

glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, inTex);
glBindImageTexture(0, outTex, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_RGBA8UI);
glUniform1i(0, 0);
glUniform1i(1, 0);

glDispatchCompute((w+15)/16, (h+15)/16, 1);

glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT); // make sure the output image has been written
    
u8* img = new u8[w * h * 4];
glBindTexture(GL_TEXTURE_2D, outTex);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA_INTEGER, GL_UNSIGNED_BYTE, img);
stbi_write_png("imgs/out.png", w, h, 1, img, w*4);
delete[] img;

The input image looks like this:

[image: input image]

But this is what I get in the output image:

[image: output image]

I simplified the shader further: instead of reading from the input texture, I just write a fixed value:

    imageStore(u_outImg, pixelPos,
        //uvec4(255.0 * texelFetch(u_inputTex, pixelPos, 0).rgb, 255u));
        uvec4(1u));

I have noticed that:

I also tried this, but it didn't work either:

    imageStore(u_outImg, pixelPos,
        vec4(texelFetch(u_inputTex, pixelPos, 0).rgb, 255u));

What am I doing wrong? My end goal is to write a postprocessing filter, but I couldn't get it to work, so I made the shader as simple as possible, and it still doesn't work.

Minimal example repo: https://github.com/tuket/stackoverflow_image_store_problem


Solution

  • glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8UI, w, h);
    

    If you want to use an unnormalized unsigned integer image, you must declare it as uimage2D in the shader. image2D is only for floating-point or normalized integer formats (range [0,1]).

  • glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
    

    That's the wrong memory barrier. The barrier bit must describe how you are going to access the shader-modified resource after the barrier, not how the shader wrote it. Since the result is read back with glGetTexImage, the correct one is:

    GL_TEXTURE_UPDATE_BARRIER_BIT

    which is explained in the reference page as (emphasis mine):

    Writes to a texture via glTex(Sub)Image*, glCopyTex(Sub)Image*, glCompressedTex(Sub)Image*, and reads via glGetTexImage after the barrier will reflect data written by shaders prior to the barrier. Additionally, texture writes from these commands issued after the barrier will not execute until all shader writes initiated prior to the barrier complete.
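
The two fixes above can be sketched together in C++ as follows. This is a minimal illustration, not the asker's actual code: fixedComputeSrc, NextAccess, barrierFor and the k...BarrierBit constants are hypothetical names, the #version line and the bounds check are assumptions added for completeness, and the bit values are repeated from the standard OpenGL headers only so that the snippet compiles without a GL header or context.

```cpp
#include <string>

// Assumed corrected shader source (hypothetical name). The only change the
// answer requires is image2D -> uimage2D; the #version line and the bounds
// check are additions that the original shader did not have.
const std::string fixedComputeSrc = R"GLSL(#version 430 core
#define KS 16 // work-group size in each dimension
layout (local_size_x = KS, local_size_y = KS) in;

layout(location = 0) uniform sampler2D u_inputTex;
// uimage2D (not image2D) matches the GL_RGBA8UI storage of the output texture
layout(location = 1) uniform writeonly uimage2D u_outImg;

void main()
{
    const ivec2 gid = ivec2(gl_WorkGroupID.xy);
    const ivec2 tid = ivec2(gl_LocalInvocationID.xy);
    const ivec2 pixelPos = ivec2(KS) * gid + tid;

    // Extra guard (assumption): skip invocations past the image edge,
    // since the dispatch rounds the group count up.
    if (any(greaterThanEqual(pixelPos, imageSize(u_outImg)))) return;

    imageStore(u_outImg, pixelPos,
        uvec4(255.0 * texelFetch(u_inputTex, pixelPos, 0).rgb, 255u));
}
)GLSL";

// Bit values as defined in the OpenGL headers, repeated here under
// hypothetical names so the sketch does not need a GL header.
constexpr unsigned kTextureFetchBarrierBit      = 0x00000008; // GL_TEXTURE_FETCH_BARRIER_BIT
constexpr unsigned kShaderImageAccessBarrierBit = 0x00000020; // GL_SHADER_IMAGE_ACCESS_BARRIER_BIT
constexpr unsigned kTextureUpdateBarrierBit     = 0x00000100; // GL_TEXTURE_UPDATE_BARRIER_BIT

// How the shader-written texture will be accessed AFTER the barrier.
enum class NextAccess { ImageLoadStore, TextureFetch, GetTexImage };

// Picks the barrier bit from the next access, not from how the data was written.
constexpr unsigned barrierFor(NextAccess next)
{
    return next == NextAccess::ImageLoadStore ? kShaderImageAccessBarrierBit
         : next == NextAccess::TextureFetch   ? kTextureFetchBarrierBit
         :                                      kTextureUpdateBarrierBit;
}
```

In the question's code this would amount to compiling fixedComputeSrc instead of the original source and calling glMemoryBarrier(barrierFor(NextAccess::GetTexImage)) between glDispatchCompute and glGetTexImage. With the real GL headers available, passing GL_TEXTURE_UPDATE_BARRIER_BIT directly is simpler; the helper only makes the "barrier follows the next access" rule explicit.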