Tags: semantics, hlsl, dxc

Custom HLSL Structure


Is it possible to replace built-in packed variables like float4 or int4 with custom HLSL structures without changing the functionality of the data? For example, instead of using a uint4 (x,y,z,w), using something like this structure:

struct customData
{
    uint        Rough;
    uint        Metal;
    uint        AO;
    uint        Spec;
};

The only purpose of changing this from a uint4 would be readability, and keeping the data straight (remembering that z is AO). If I try to sneak this into my vertex input, for example:

struct vertexData
{
    float4      Position    : ATTRIB0;
    float4      Normal      : ATTRIB1;
    customData  Custom      : ATTRIB2;
    float4      TexCoord    : ATTRIB3;
};

...the dxc compiler throws an error about overlapping attributes. From some limited testing, it looks like the compiler automatically assigns semantics in sequence, like this:

struct customData
{
    uint        Rough   : ATTRIB2; // ATTRIB2+0
    uint        Metal   : ATTRIB3; // ATTRIB2+1 (overlap with TexCoord)
    uint        AO      : ATTRIB4; // ATTRIB2+2
    uint        Spec    : ATTRIB5; // ATTRIB2+3
};

Is it possible to prevent this from happening? Or is there any way to place such data into named variables without changing the structure or order of things? Additionally, I'm not fluent in HLSL, so I'm not entirely sure what the eventual runtime difference is between 4 uint variables with unique semantics and a single uint4 with only one semantic. Is there any difference?

I appreciate any insights or advice!


Solution

  • For readability, you can keep these numbers in a uint4 and convert to/from your custom structure with an inline function, as in the sketch below. I'm pretty sure the compiler will emit code equivalent to the less readable version that handles the uint4 values directly.
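
    A minimal sketch of that approach (unpackCustom and packCustom are just illustrative names, nothing standard):

        struct customData
        {
            uint    Rough;
            uint    Metal;
            uint    AO;
            uint    Spec;
        };

        customData unpackCustom( uint4 v )
        {
            customData d;
            d.Rough = v.x;
            d.Metal = v.y;
            d.AO    = v.z;  // no more remembering that z is AO
            d.Spec  = v.w;
            return d;
        }

        uint4 packCustom( customData d )
        {
            return uint4( d.Rough, d.Metal, d.AO, d.Spec );
        }

        struct vertexData
        {
            float4  Position    : ATTRIB0;
            float4  Normal      : ATTRIB1;
            uint4   Custom      : ATTRIB2;  // single semantic, no overlap
            float4  TexCoord    : ATTRIB3;
        };

    In the shader body you can then write customData c = unpackCustom( input.Custom ); and refer to c.AO by name. HLSL functions are generally inlined by the compiler, so this shouldn't cost anything.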

    what the eventual runtime difference is between 4 uint variables with unique semantics and a single uint4 with only one semantic. Is there any difference?

    I don’t know, but I think the runtime difference, if any, is specific to the GPU model and even the driver version. GPU drivers include yet another shader compiler, a just-in-time one, which translates the DXBC byte code produced by Microsoft's fxc.exe, or the newer DXIL byte code produced by dxc.exe, into actual hardware instructions for your specific GPU model.

    Also, note that there’s a hard limit on the count of these semantics, and that limit is 32, which is a rather small number. For D3D11, see the D3D11_VS_INPUT_REGISTER_COUNT, D3D11_VS_OUTPUT_REGISTER_COUNT, and D3D11_GS_OUTPUT_REGISTER_COUNT values in that table. The same limit applies to D3D12 as well; see D3D12_DS_OUTPUT_REGISTER_COUNT, D3D12_VS_OUTPUT_REGISTER_COUNT, etc., in that table.
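
    One concrete difference that matters for that limit (assuming I'm reading the input signature rules correctly: for vertex inputs, each distinct semantic occupies its own input register): the four-scalar version below consumes four of those 32 registers, while the packed uint4 version consumes only one. The struct names here are just for illustration.

        // Four separate semantics: four input registers out of 32
        struct vertexDataScalars
        {
            uint    Rough   : ATTRIB2;
            uint    Metal   : ATTRIB3;
            uint    AO      : ATTRIB4;
            uint    Spec    : ATTRIB5;
        };

        // One packed semantic: a single input register
        struct vertexDataPacked
        {
            uint4   Custom  : ATTRIB2;
        };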