What use has the scalar layout specifier when accessing a storage buffer in GL_EXT_scalar_block_layout? (see below for example)
What would be the use case for scalar?
I recently programmed a simple raytracer using Vulkan and NVIDIA's VkRayTracing extension and was following this tutorial. In the section about the closest hit shader, it is required to access some data that's stored in, well, storage buffers (with usage flags vk::BufferUsageFlagBits::eStorageBuffer).
In the shader, the extension GL_EXT_scalar_block_layout is used and those buffers are accessed like this:
layout(binding = 4, set = 1, scalar) buffer Vertices { Vertex v[]; } vertices[];
When I first used this code, the validation layers told me that structs like Vertex had an invalid layout, so I changed them to have each member aligned on 16-byte blocks:
struct Vertex {
vec4 position;
vec4 normal;
vec4 texCoord;
};
with the corresponding struct in C++:
#pragma pack(push, 1)
struct Vertex {
glm::vec4 position_1unused;
glm::vec4 normal_1unused;
glm::vec4 texCoord_2unused;
};
#pragma pack(pop)
The errors disappeared and I got a working raytracer. But I still don't understand why the scalar keyword is used here. I found this document talking about the GL_EXT_scalar_block_layout extension, but I really don't understand it. Probably I'm just not used to GLSL terminology? I can't see any reason why I would have to use this.
Also, I just tried removing the scalar keyword and it still worked without any difference, warnings or errors whatsoever. I would be grateful for any clarification or further resources on this topic.
The std140 and std430 layouts do quite a bit of rounding of the offsets, alignments and sizes of objects. std140 basically rounds the alignment of every array element and struct up to that of a vec4. std430 relaxes that somewhat, but it still rounds a vec3 (and anything containing one) up to a vec4's alignment.
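To make the rounding concrete, here is a minimal C++ sketch of the stride arithmetic. The helper names (roundUp, uintArrayStrideStd140, and so on) are made up for illustration and are not part of any Vulkan or GLSL API; the sizes and alignments are the ones the std140/std430 rules assign to uint and vec3.

```cpp
#include <cstddef>

// Round `size` up to the next multiple of `alignment`.
constexpr std::size_t roundUp(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) / alignment * alignment;
}

constexpr std::size_t kUintSize  = 4;   // size and alignment of a GLSL uint
constexpr std::size_t kVec3Size  = 12;  // three 4-byte floats
constexpr std::size_t kVec4Align = 16;  // alignment of a GLSL vec4

// Stride of `uint data[]`: std140 rounds every array element up to vec4
// alignment, std430 keeps the element's own alignment.
constexpr std::size_t uintArrayStrideStd140() { return roundUp(kUintSize, kVec4Align); }
constexpr std::size_t uintArrayStrideStd430() { return roundUp(kUintSize, kUintSize); }

// Stride of `vec3 data[]`: a vec3 is aligned like a vec4 in both layouts,
// so even std430 pads every element from 12 to 16 bytes.
constexpr std::size_t vec3ArrayStrideStd430() { return roundUp(kVec3Size, kVec4Align); }

static_assert(uintArrayStrideStd140() == 16, "std140: one uint per 16 bytes");
static_assert(uintArrayStrideStd430() == 4,  "std430: uints are tightly packed");
static_assert(vec3ArrayStrideStd430() == 16, "std430: vec3 still padded to 16");
```

So 1000 uints occupy 16000 bytes in an std140 block but 4000 bytes in an std430 block, while a vec3 array is padded in both.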
scalar layout basically means laying out objects in accordance with their component scalars. Anything that aggregates components (vectors, matrices, arrays, and structs) does not affect layout. In particular:
All types are sized/aligned only to the highest alignment of the scalar components that they actually use. So a struct containing a single uint is sized/aligned to the same size/alignment as a uint: 4 bytes. Under std140 rules, it would have 16-byte size and alignment.
Note that this layout makes vec3 and similar types actually viable, because plain C and C++ structs can then follow alignment rules that match those of GLSL.
The array stride of elements in the array is based solely on the size/alignment of the element type, recursively. So an array of uint has an array stride of 4 bytes; under std140 rules, it would have a 16-byte stride.
Alignment and padding only matter for scalars. If you have a struct containing a uint followed by a uvec2, under std140/std430 this will require 16 bytes, with 4 bytes of padding after the first uint. Under scalar layout, such a struct only takes 12 bytes (and is aligned to 4 bytes), with the uvec2 being conceptually misaligned. Padding therefore only exists if you have smaller scalars, like a uint16_t followed by a uint.
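As a rough illustration of the points above, these C++ structs mirror the GLSL layouts on the host side. The struct names are hypothetical, and the static_asserts assume a typical ABI where uint32_t is 4-byte aligned and uint16_t is 2-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// GLSL: struct { uint a; uvec2 b; } under scalar layout.
// A plain C++ struct already matches: no padding, 4-byte alignment.
struct ScalarUintUvec2 {
    std::uint32_t a;
    std::uint32_t b[2];
};

// The same GLSL struct under std430: the uvec2 must start on an
// 8-byte boundary, so 4 bytes of padding follow `a`.
struct Std430UintUvec2 {
    std::uint32_t a;
    alignas(8) std::uint32_t b[2];
};

// GLSL: struct { uint16_t a; uint b; } under scalar layout.
// Padding still appears here because the scalars differ in size.
struct ScalarU16ThenUint {
    std::uint16_t a;
    std::uint32_t b;
};

static_assert(sizeof(ScalarUintUvec2) == 12, "scalar: tightly packed");
static_assert(offsetof(ScalarUintUvec2, b) == 4, "scalar: uvec2 right after uint");
static_assert(sizeof(Std430UintUvec2) == 16, "std430: padded to 16 bytes");
static_assert(offsetof(Std430UintUvec2, b) == 8, "std430: uvec2 at offset 8");
static_assert(offsetof(ScalarU16ThenUint, b) == 4, "2 bytes of padding before uint");
```

With scalar layout you can upload an array of ScalarUintUvec2 as-is; with std430 you need the alignas (or manual padding members) to keep host and shader in agreement.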
In the specific case you showed, scalar layout was unnecessary since all of the types you used are vec4s.
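For comparison, here is a sketch of what scalar layout would buy you with a tighter vertex. PaddedVertex mirrors the all-vec4 struct from your question; TightVertexScalar and TightVertexStd430 are a hypothetical vec3/vec3/vec2 vertex that I made up for illustration, and vertexBufferBytes is just a helper for the arithmetic. The asserts assume 4-byte floats.

```cpp
#include <cstddef>

// Host-side mirror of the all-vec4 Vertex from the question.
// Offsets are 0/16/32 under std140, std430 and scalar alike.
struct PaddedVertex {
    float position[4];
    float normal[4];
    float texCoord[4];
};

// Hypothetical tighter vertex: vec3 position, vec3 normal, vec2 texCoord.
// Under scalar layout this plain struct matches the GLSL struct directly.
struct TightVertexScalar {
    float position[3];
    float normal[3];
    float texCoord[2];
};

// The same members mirrored for std430: vec3 is aligned to 16 bytes and
// vec2 to 8 bytes, so the struct grows back to 48 bytes.
struct TightVertexStd430 {
    alignas(16) float position[3];
    alignas(16) float normal[3];
    alignas(8)  float texCoord[2];
};

// Bytes needed for a buffer of `vertexCount` vertices with the given stride.
constexpr std::size_t vertexBufferBytes(std::size_t vertexCount, std::size_t stride) {
    return vertexCount * stride;
}

static_assert(sizeof(PaddedVertex) == 48, "three vec4s");
static_assert(sizeof(TightVertexScalar) == 32, "scalar: 8 floats, no padding");
static_assert(sizeof(TightVertexStd430) == 48, "std430: padding eats the savings");
```

In other words, scalar only starts to matter once the struct contains something other than vec4s; with the tight vertex it would cut the buffer by a third and let the C++ struct stay free of manual padding.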