I am currently learning shaders in OpenGL and have finished writing my "drawText" geometry shader, so I can draw dynamic text (the content changes every frame) without recreating the VBO every frame.
It's working nicely, but it's limited to 28 chars because of the GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS limit, which is 1024.
At the moment I emit 6 components per vertex: a vec4 pos and a vec2 texCoord. That gives me 1024/6 = 170 vertices to use for my triangle strip.
I need 6 vertices per char (except for the first and last char): 4 to display the quad and 2 to move to the next char with degenerate triangles. That gives me 170/6 = 28 chars.
So when I have a long text, I split it into chunks of 28 chars.
So now I am trying to optimize that and get my geometry shader to draw more than 28 chars. Because I am in 2D, I was trying to find a way to store the texCoord in pos.zw for the fragment shader and remove the out vec2 texCoord from my geometry shader. That would make me emit only 4 components per vertex, so 1024/4 = 256 vertices and 256/6 = 42 chars.
But reading the fragment shader documentation and the list of fragment shader system inputs, I don't see how to do this. So, is there a way to achieve that?
My code for reference
Vertex Shader
#version 330 core
layout (location = 0) in vec2 aPos;
uniform vec2 textPosition;
void main()
{
gl_Position = vec4(aPos ,0, 1) + vec4(textPosition, 0, 0);
}
Fragment Shader
#version 330 core
out vec4 fragColor;
in vec2 texCoord;
uniform vec4 textColor;
uniform sampler2D outTexture;
void main()
{
fragColor = texture(outTexture, texCoord) * textColor;
}
Geometry Shader
#version 330 core
layout (points) in;
layout (triangle_strip, max_vertices = 170) out;
// max components and vertices are 1024
// vec4 pos and vec2 text coord per vertex, that 6 components per vertex, 1024 / 6 = 170
out vec2 texCoord;
uniform float screenRatio = 1;
uniform float fontRatio = 1;
uniform float fontInterval = 0; // distance between letters
uniform float fontSize = 0.025f; // default value, screen coord range is -1f , 1f
uniform int textString[8]; // 4 chars packed per int; only 28 chars can be drawn (170 vertices / 6 = 28), which needs 28 / 4 = 7 ints
void main()
{
vec4 position = gl_in[0].gl_Position;
float fsx = fontSize * fontRatio * screenRatio;
float fsy = fontSize;
float tsy = 1.0f / 16.0f; // fixed in a 16x16 chars bitmap
float tsx = tsy;
float tw = tsx * fontRatio;
float to = ( tsx - tw ) * 0.5f;
vec4 ptl = position + vec4(0,0,0,0); // top left
vec4 ptr = position + vec4(fsx,0,0,0); // top right
vec4 pbl = position + vec4(0,fsy,0,0); // bottom left
vec4 pbr = position + vec4(fsx,fsy,0,0); // bottom right
vec2 tt; // tex coord top
vec2 tb; // tex coord bottom
fsx += fontInterval;
int i = 0; // index in int array
int si = 0; // sub index in int
int ti = textString[0];
int ch = 0;
do
{
// unpack a char, 4 chars per int
ch = (ti >> si) & (0xFF);
// string ends with \0 or end of array
if ( ch == 0 || i >= 8)
break;
// compute row and col of char in bitmaps 16x16 chars
int r = ch >> 4;
int c = ch - ( r << 4 );
// compute tex coord from row and column
tb = vec2(c * tsx + to, 1.0f - r * tsy);
tt = vec2(tb.x , tb.y - tsy);
texCoord = tt;
gl_Position = ptl;
EmitVertex();
EmitVertex();
texCoord = tb;
gl_Position = pbl;
EmitVertex();
tt.x += tw;
tb.x += tw;
texCoord = tt;
gl_Position = ptr;
EmitVertex();
texCoord = tb;
gl_Position = pbr;
EmitVertex();
EmitVertex();
// advance of 1 char
ptl.x += fsx;
ptr.x += fsx;
pbl.x += fsx;
pbr.x += fsx;
si += 8;
if ( si >= 32 )
{
si = 0;
++i;
// stop before reading past the end of textString
if ( i >= 8 )
break;
ti = textString[i];
}
}
while ( true );
EndPrimitive();
}
The position of a vertex to be sent to the rasterizer, as defined through gl_Position, contains 4 components. Always. And the meaning of those components is defined by the rasterizer and the OpenGL rendering system.
You cannot bypass or otherwise get around it. The output position has 4 components, and you cannot hide texture coordinates or other arbitrary data within them.
If you need to output more stuff from the GS, then you need to more efficiently use your GS's vertex output. As it currently stands, you output degenerate strips between each quad. This means that for every 6 vertices, only 4 of them are meaningful. You're using degenerate strips to split quads.
Instead of doing that, you should use EndPrimitive to split your quads. That will remove 1/3rd of all of your vertex output, giving you more components to put to actual good use.
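As a rough sketch of what that could look like, here is the question's geometry shader reworked to emit one 4-vertex strip per character and close it with EndPrimitive. With 6 components per vertex the budget is still 1024/6 = 170 vertices, but at 4 vertices per quad that is 170/4 = 42 chars. The array size of 11 ints, the MAX_CHARS constant and the renamed locals are assumptions made for this example, not part of the original code, and the application would have to upload 11 ints instead of 8. It has not been tested against the original font bitmap.

```glsl
#version 330 core
layout (points) in;
// 6 components per vertex: 1024 / 6 = 170 vertices; 4 vertices per quad: 170 / 4 = 42 quads
layout (triangle_strip, max_vertices = 168) out;

out vec2 texCoord;

uniform float screenRatio = 1;
uniform float fontRatio = 1;
uniform float fontInterval = 0;  // distance between letters
uniform float fontSize = 0.025f; // screen coord range is -1, 1
uniform int textString[11];      // assumption: 11 ints = 44 packed chars, only the first 42 are drawn

const int MAX_CHARS = 42;        // hypothetical constant: 168 vertices / 4 per quad

void main()
{
    vec4 origin = gl_in[0].gl_Position;
    float fsx = fontSize * fontRatio * screenRatio;
    float fsy = fontSize;
    float tsy = 1.0 / 16.0;      // fixed 16x16 chars bitmap
    float tsx = tsy;
    float tw = tsx * fontRatio;
    float texOffset = (tsx - tw) * 0.5;
    float advance = fsx + fontInterval;

    for (int n = 0; n < MAX_CHARS; ++n)
    {
        // unpack char n: 4 chars per int, lowest byte first
        int ch = (textString[n >> 2] >> ((n & 3) * 8)) & 0xFF;
        if (ch == 0) // string ends with \0
            break;

        // row and column of the char in the 16x16 bitmap
        int r = ch >> 4;
        int c = ch & 15;
        vec2 tb = vec2(c * tsx + texOffset, 1.0 - r * tsy);
        vec2 tt = vec2(tb.x, tb.y - tsy);

        vec4 p = origin + vec4(n * advance, 0, 0, 0);

        // Emit order pbl, ptl, pbr, ptr (the question's corner names) keeps
        // the same triangle winding as the original degenerate-strip version.
        texCoord = tb;               gl_Position = p + vec4(0, fsy, 0, 0);   EmitVertex();
        texCoord = tt;               gl_Position = p;                        EmitVertex();
        texCoord = tb + vec2(tw, 0); gl_Position = p + vec4(fsx, fsy, 0, 0); EmitVertex();
        texCoord = tt + vec2(tw, 0); gl_Position = p + vec4(fsx, 0, 0, 0);   EmitVertex();

        // close this quad's strip instead of stitching quads with degenerate triangles
        EndPrimitive();
    }
}
```

The vertex and fragment shaders from the question should work unchanged with this version, since the outputs are still gl_Position and texCoord.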