I am trying to implement SSAO following OGLDev Tutorial 45, which is based on a tutorial by John Chapman. The OGLDev tutorial uses a highly simplified method: it samples random points within a radius around the fragment position and steps up the AO factor depending on how many of the sampled points have a depth greater than the actual surface depth stored at that location (the more of the surrounding surface lies in front of the sampled points, the greater the occlusion).
The 'engine' I use does not have deferred shading as modular as OGLDev's, but basically it first renders the whole screen's colors to a framebuffer with a texture attachment and a depth renderbuffer attachment. To compare the depths, the fragment view-space positions are rendered to another framebuffer with a texture attachment. Those textures are then post-processed by the SSAO shader, and the result is drawn to a screen-filling quad. Both textures draw fine to the quad on their own, and the shader's input uniforms seem to be OK as well, which is why I haven't included any engine code.
The fragment shader is almost identical to the tutorial's, as you can see below. I have included some comments that reflect my own understanding.
#version 330 core

in vec2 texCoord;
layout(location = 0) out vec4 outColor;

const int RANDOM_VECTOR_ARRAY_MAX_SIZE = 128; // reference uses 64
const float SAMPLE_RADIUS = 1.5f; // TODO: play with this value, reference uses 1.5

uniform sampler2D screenColorTexture; // the whole rendered screen
uniform sampler2D viewPosTexture; // interpolated vertex positions in view space
uniform mat4 projMat;

// we use a uniform buffer object for better performance
// note: std140 pads every vec3 array element to 16 bytes, so the buffer data must match
layout (std140) uniform RandomVectors
{
    vec3 randomVectors[RANDOM_VECTOR_ARRAY_MAX_SIZE];
};

void main()
{
    vec4 screenColor = texture(screenColorTexture, texCoord).rgba;
    vec3 viewPos = texture(viewPosTexture, texCoord).xyz;
    float AO = 0.0;

    // sample random points to compare depths around the view space position.
    // the more sampled points lie behind the actual surface depth at the sampled position,
    // the higher the probability of the surface point to be occluded.
    for (int i = 0; i < RANDOM_VECTOR_ARRAY_MAX_SIZE; ++i) {
        // take a random sample point.
        vec3 samplePos = viewPos + randomVectors[i];

        // project the sample point to screen space
        // to find the texture coordinate at which to look up the depth value
        // (i.e. actual surface geometry) to compare against
        vec4 offset = vec4(samplePos, 1.0);
        offset = projMat * offset; // transform into clip space
        offset.xy /= offset.w; // perform perspective divide
        offset.xy = offset.xy * 0.5 + vec2(0.5); // transform to [0,1] range
        float sampleActualSurfaceDepth = texture(viewPosTexture, offset.xy).z;

        // compare depth of random sampled point to actual depth at sampled xy position:
        // the function step(edge, value) returns 1.0 if value >= edge, else 0.0
        // thus if the random sampled point's depth is greater than or equal to (lies behind)
        // the actual surface depth at that point, the probability of occlusion increases.
        // note: if the actual depth at the sampled position is too far off from the depth at the fragment position,
        // i.e. the surface has a sharp ridge/crevice, it doesn't add to the occlusion, to avoid artifacts.
        if (abs(viewPos.z - sampleActualSurfaceDepth) < SAMPLE_RADIUS) {
            AO += step(sampleActualSurfaceDepth, samplePos.z);
        }
    }

    // normalize the ratio of sampled points lying behind the surface to a probability in [0,1]
    // the occlusion factor should make the color darker, not lighter, so we invert it.
    AO = 1.0 - AO / float(RANDOM_VECTOR_ARRAY_MAX_SIZE);

    ///
    outColor = screenColor + mix(vec4(0.2), vec4(pow(AO, 2.0)), 1.0);
    /*/
    outColor = vec4(viewPos, 1); // DEBUG: draw view space positions
    //*/
}
vec2 texCoord = gl_FragCoord.xy / textureSize(screenColorTexture, 0);
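To check the screen-space mapping in isolation, the projection steps from the loop (multiply by projMat, perspective divide, remap to [0,1]) can be reproduced on the CPU. This is only a minimal C++ sketch under assumptions: `Mat4`, `Vec2`, `Vec3`, `perspective` and `viewPosToTexCoord` are hypothetical helpers that are not part of the engine, and the matrix is assumed to be a standard column-major OpenGL perspective matrix with the camera looking down -Z.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Column-major 4x4 matrix, the memory layout OpenGL expects (assumed here).
using Mat4 = std::array<float, 16>;

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Standard OpenGL perspective matrix (right-handed, camera looks down -Z).
Mat4 perspective(float fovYRadians, float aspect, float zNear, float zFar) {
    assert(zNear > 0.0f && zFar > zNear);
    const float f = 1.0f / std::tan(fovYRadians * 0.5f);
    Mat4 m{};  // all elements start at zero
    m[0]  = f / aspect;
    m[5]  = f;
    m[10] = (zFar + zNear) / (zNear - zFar);
    m[11] = -1.0f;
    m[14] = (2.0f * zFar * zNear) / (zNear - zFar);
    return m;
}

// Mirrors the shader: clip = projMat * vec4(p, 1); xy /= w; xy = xy * 0.5 + 0.5
Vec2 viewPosToTexCoord(const Mat4& proj, Vec3 p) {
    const float cx = proj[0] * p.x + proj[4] * p.y + proj[8]  * p.z + proj[12];
    const float cy = proj[1] * p.x + proj[5] * p.y + proj[9]  * p.z + proj[13];
    const float cw = proj[3] * p.x + proj[7] * p.y + proj[11] * p.z + proj[15];
    return Vec2{ cx / cw * 0.5f + 0.5f, cy / cw * 0.5f + 0.5f };
}
```

With a 90° vertical field of view and an aspect ratio of 1, a point on the view axis such as (0, 0, -5) should land at the texture center (0.5, 0.5), and (0, 5, -5) at the top edge. If the shader's lookups disagree with hand-checked values like these, the mapping rather than the occlusion test is the more likely culprit.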
When I set the AO mixing factor at the bottom of the fragment shader to 0, it runs smoothly at the FPS cap (even though the calculations are still performed; at least I guess the compiler won't optimize that away :D). But when the AO is mixed in, it takes up to 80 ms per frame (getting slower over time, as if the buffers were filling up), and the result is really interesting and confusing:
Obviously the mapping seems far off, and the flickering noise seems very random, as if it corresponded directly to the random sample vectors. I found it most interesting that the draw time increased massively only when the AO factor was added, not because of the occlusion calculation. Is there an issue with the draw buffers?
The issue appeared to be linked to the chosen texture types.
The texture with handle viewPosTexture needed to be explicitly defined with a float texture format such as GL_RGB16F or GL_RGBA32F, instead of just GL_RGB. Interestingly, the separate textures were drawn fine; the issues arose only in combination.
// generate screen color texture
// note: GL_NEAREST interpolation is ok since there is no subpixel sampling anyway
glGenTextures(1, &screenColorTexture);
glBindTexture(GL_TEXTURE_2D, screenColorTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, windowWidth, windowHeight, 0, GL_BGR, GL_UNSIGNED_BYTE, NULL);
// generate depth renderbuffer. without this, depth testing won't work.
// we use a renderbuffer since we won't have to sample this, opengl uses it directly.
glGenRenderbuffers(1, &screenDepthBuffer);
glBindRenderbuffer(GL_RENDERBUFFER, screenDepthBuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT, windowWidth, windowHeight);
// generate vertex view space position texture
glGenTextures(1, &viewPosTexture);
glBindTexture(GL_TEXTURE_2D, viewPosTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, windowWidth, windowHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, NULL);
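A likely reason the format matters: a plain GL_RGB attachment is normally backed by 8-bit unsigned normalized channels, so whatever the position pass writes is clamped to [0,1] and quantized to 256 levels. The following C++ sketch only simulates that round trip on the CPU; `roundTripUnorm8` is a hypothetical helper, and the exact storage an implementation picks for an unsized GL_RGB is an assumption here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Simulates writing a shader output into an 8-bit unsigned normalized channel
// and reading it back: clamp to [0,1], quantize to 256 levels, convert to float.
float roundTripUnorm8(float value) {
    assert(std::isfinite(value));  // NaN/inf are out of scope for this sketch
    const float clamped = std::min(std::max(value, 0.0f), 1.0f);
    const auto stored = static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
    return static_cast<float>(stored) / 255.0f;
}
```

Under these assumptions a view-space z of -3.7 comes back as 0.0 and 12.0 comes back as 1.0, so the depth comparison in the shader has nothing meaningful left to compare. A float format such as GL_RGBA32F stores the values without this clamp.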
The slow drawing might be caused by the GLSL mix function. I will investigate that further.
The flickering was due to the regeneration and passing of new random vectors in each frame. Just passing enough random vectors once solves the issue. Otherwise it might help to blur the SSAO result.
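Generating the vectors once at startup could look like the sketch below. It is C++ under assumptions, not the engine's code: `makeSampleKernel`, `PaddedVec3`, the fixed seed and the rejection sampling inside a sphere are all illustrative choices. The one hard constraint is the std140 layout, which gives every element of a vec3 array a 16-byte stride, so each vector is stored as four floats.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// std140 pads each vec3 array element to 16 bytes: x, y, z plus one unused float.
using PaddedVec3 = std::array<float, 4>;

// Builds `count` random offsets inside a sphere of the given radius.
// A fixed seed gives the same kernel on every run, so nothing changes per frame.
std::vector<PaddedVec3> makeSampleKernel(std::size_t count, float radius, unsigned seed) {
    assert(radius > 0.0f);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);

    std::vector<PaddedVec3> kernel;
    kernel.reserve(count);
    while (kernel.size() < count) {
        const float x = dist(rng);
        const float y = dist(rng);
        const float z = dist(rng);
        // Rejection sampling: keep only points inside the unit sphere.
        if (x * x + y * y + z * z > 1.0f) {
            continue;
        }
        kernel.push_back(PaddedVec3{{x * radius, y * radius, z * radius, 0.0f}});
    }
    return kernel;
}
```

The resulting vector can then be uploaded a single time at initialization, for example with glBufferData on the uniform buffer, instead of every frame. The count would be 128 to match RANDOM_VECTOR_ARRAY_MAX_SIZE.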
Basically, the SSAO works now! What remains are just more or less obvious bugs.