Tags: directx, directx-11, compute-shader, vertex-buffer

How to make the vertex buffer available to a compute shader


I have a C++ program that uses Direct3D to draw a 3D model. That program works as expected, i.e. images of the 3D model are rendered properly. The next step is to write a compute shader that takes the vertex and index buffers as input and does some calculations with that data when the user executes a certain workflow.

To enable this, I need to make the vertex and index buffers available to the compute shader. This is where things go wrong: as soon as I change the code so that the buffers can be used by the compute shader, the program no longer renders any images.

The 3D model is made up of vertices of the following type:

struct VertexWithNormal
{
  ::DirectX::XMFLOAT4 coordinates;
  ::DirectX::XMFLOAT4 normal;
  int                 whatever1;
  int                 whatever2;
  int                 whatever3;
  int                 whatever4;
};
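
Both the input layout below and the compute-shader code later in this post rely on the exact size and packing of this struct, so a compile-time check can catch accidental padding. This is an optional sketch, not part of the original program:

// Optional sanity check (not in the original code): the offsetof-based input
// layout and the byte offsets used later in the compute shader assume a
// tightly packed vertex: 2 * sizeof(XMFLOAT4) + 4 * sizeof(int) = 48 bytes.
static_assert(sizeof(VertexWithNormal) == 2 * sizeof(::DirectX::XMFLOAT4) + 4 * sizeof(int),
              "VertexWithNormal contains unexpected padding");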

This layout is then made available to the GPU by the following code:

static const D3D11_INPUT_ELEMENT_DESC vertexDesc[] =
{
  { "POSITION" , 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, offsetof(VertexWithNormal, coordinates), D3D11_INPUT_PER_VERTEX_DATA, 0 },
  { "NORMAL"   , 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, offsetof(VertexWithNormal, normal)     , D3D11_INPUT_PER_VERTEX_DATA, 0 },
  { "WHATEVER1", 0, DXGI_FORMAT_R32_SINT          , 0, offsetof(VertexWithNormal, whatever1)  , D3D11_INPUT_PER_VERTEX_DATA, 0 },
  { "WHATEVER2", 0, DXGI_FORMAT_R32_SINT          , 0, offsetof(VertexWithNormal, whatever2)  , D3D11_INPUT_PER_VERTEX_DATA, 0 },
  { "WHATEVER3", 0, DXGI_FORMAT_R32_SINT          , 0, offsetof(VertexWithNormal, whatever3)  , D3D11_INPUT_PER_VERTEX_DATA, 0 },
  { "WHATEVER4", 0, DXGI_FORMAT_R32_SINT          , 0, offsetof(VertexWithNormal, whatever4)  , D3D11_INPUT_PER_VERTEX_DATA, 0 },
};

winrt::check_hresult(_d3dDevice->CreateInputLayout(vertexDesc, ARRAYSIZE(vertexDesc), shaderCode, shaderSize, _inputLayout.put()));
_d3dContext->IASetInputLayout(_inputLayout.get());

The vertex buffer is then created by the following code:

UINT stride = sizeof(VertexWithNormal);
UINT offset = 0;

D3D11_SUBRESOURCE_DATA vertexBufferData{};
vertexBufferData.pSysMem = _model->GetVertexPointer();

D3D11_BUFFER_DESC vertexBufferDesc{};
vertexBufferDesc.Usage               = D3D11_USAGE_DEFAULT;
vertexBufferDesc.BindFlags           = D3D11_BIND_VERTEX_BUFFER;
vertexBufferDesc.ByteWidth           = 1234;
vertexBufferDesc.StructureByteStride = stride;

winrt::check_hresult(_d3dDevice->CreateBuffer(&vertexBufferDesc, &vertexBufferData, _vertexBuffer.put()));

ID3D11Buffer* vertexBuffer = _vertexBuffer.get();

_d3dContext->IASetVertexBuffers(0, 1, &vertexBuffer, &stride, &offset);

Finally, the following code creates the index buffer which is just made up of unsigned int:

D3D11_SUBRESOURCE_DATA indexBufferData{};
indexBufferData.pSysMem = _model->GetIndexPointer();

D3D11_BUFFER_DESC indexBufferDesc{};
indexBufferDesc.Usage     = D3D11_USAGE_DEFAULT;
indexBufferDesc.BindFlags = D3D11_BIND_INDEX_BUFFER;
indexBufferDesc.ByteWidth = 4321;

winrt::check_hresult(_d3dDevice->CreateBuffer(&indexBufferDesc, &indexBufferData, _indexBuffer.put()));

_d3dContext->IASetIndexBuffer(_indexBuffer.get(), DXGI_FORMAT_R32_UINT, 0);

Now, the HLSL code of the compute shader is supposed to look like this:

struct Vertex
{
  float4 Coordinates;
  float4 Normal;
  int    Whatever1;
  int    Whatever2;
  int    Whatever3;
  int    Whatever4;
};

StructuredBuffer<Vertex> VertexBuffer : register(t0);
StructuredBuffer<uint>   IndexBuffer  : register(t1);

[numthreads(128, 1, 1)]
void main(uint3 dispatchThreadId : SV_DispatchThreadID)
{
  uint vertexIndex = dispatchThreadId.x * 3;

  // Retrieve the actual values from the vertex and index buffers
  Vertex vertex0 = VertexBuffer[IndexBuffer[vertexIndex]];
  Vertex vertex1 = VertexBuffer[IndexBuffer[vertexIndex + 1u]];
  Vertex vertex2 = VertexBuffer[IndexBuffer[vertexIndex + 2u]];

  // Calculations with the vertices...
}

My understanding is that the vertex and index buffers need to be exposed as StructuredBuffers in the HLSL code. Apparently a StructuredBuffer is read-only from the perspective of the compute shader, which is what I need. I also tried to map those buffers as a constant buffer (cbuffer) but was unsuccessful with that attempt.

Now, my understanding is that I need to create an ID3D11ShaderResourceView for both the vertex and the index buffer so that I can make them available to the shader by calling CSSetShaderResources on the Direct3D context. I tried to do so by changing the code that creates the vertex buffer as follows:

UINT stride = sizeof(VertexWithNormal);
UINT offset = 0;

D3D11_SUBRESOURCE_DATA vertexBufferData{};
vertexBufferData.pSysMem = _model->GetVertexPointer();

D3D11_BUFFER_DESC vertexBufferDesc{};
vertexBufferDesc.Usage               = D3D11_USAGE_DEFAULT;
// Compared to the working code, I'm ORing D3D11_BIND_SHADER_RESOURCE to D3D11_BIND_VERTEX_BUFFER
vertexBufferDesc.BindFlags           = D3D11_BIND_VERTEX_BUFFER | D3D11_BIND_SHADER_RESOURCE;
// Seems to be necessary, but adding the below line results in empty images, nothing is rendered anymore
vertexBufferDesc.MiscFlags           = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
vertexBufferDesc.ByteWidth           = 1234;
vertexBufferDesc.StructureByteStride = stride;

winrt::check_hresult(_d3dDevice->CreateBuffer(&vertexBufferDesc, &vertexBufferData, _vertexBuffer.put()));

ID3D11Buffer* vertexBuffer = _vertexBuffer.get();

_d3dContext->IASetVertexBuffers(0, 1, &vertexBuffer, &stride, &offset);

// Code that seems to be necessary to make the vertex buffer available to the compute shader
D3D11_SHADER_RESOURCE_VIEW_DESC computeShaderResourceDesc{};
computeShaderResourceDesc.Format               = DXGI_FORMAT_UNKNOWN;
computeShaderResourceDesc.ViewDimension        = D3D11_SRV_DIMENSION_BUFFEREX;
computeShaderResourceDesc.BufferEx.NumElements = <actual size...>;

winrt::check_hresult(_d3dDevice->CreateShaderResourceView(vertexBuffer, &computeShaderResourceDesc, _vertexBufferShaderResourceView.put()));

ID3D11ShaderResourceView* vertexBufferShaderResourceView = _vertexBufferShaderResourceView.get();

_d3dContext->CSSetShaderResources(0, 1, &vertexBufferShaderResourceView);

However, modifying the code as shown above breaks rendering, i.e. images of the 3D model are no longer rendered.

  1. Is it even possible to use the same vertex and index buffer to render images and to read them from a compute shader?
  2. If the answer to 1 is yes, what am I doing wrong?

Edit:

Enabling the DirectX debug layer results in the following errors being emitted when I run my program:

D3D11 ERROR: ID3D11Device::CreateBuffer: Buffers created with D3D11_RESOURCE_MISC_BUFFER_STRUCTURED cannot specify any of the following listed bind flags.  The following BindFlags bits (0x9) are set: D3D11_BIND_VERTEX_BUFFER (1), D3D11_BIND_INDEX_BUFFER (0), D3D11_BIND_CONSTANT_BUFFER (0), D3D11_BIND_STREAM_OUTPUT (0), D3D11_BIND_RENDER_TARGET (0), or D3D11_BIND_DEPTH_STENCIL (0). [ STATE_CREATION ERROR #68: CREATEBUFFER_INVALIDMISCFLAGS]
D3D11 ERROR: ID3D11Device::CreateBuffer: CreateBuffer returning E_INVALIDARG, meaning invalid parameters were passed. [ STATE_CREATION ERROR #69: CREATEBUFFER_INVALIDARG_RETURN]

Removing vertexBufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED; then emits the following errors:

D3D11 ERROR: ID3D11Device::CreateShaderResourceView: The Format (0, UNKNOWN) cannot be used, when creating a View of a Buffer. [ STATE_CREATION ERROR #127: CREATESHADERRESOURCEVIEW_INVALIDFORMAT]
D3D11 ERROR: ID3D11Device::CreateShaderResourceView: Returning E_INVALIDARG, meaning invalid parameters were passed. [ STATE_CREATION ERROR #131: CREATESHADERRESOURCEVIEW_INVALIDARG_RETURN]

Now the question is what needs to be put into Format here. No idea at this point.
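
For reference, the debug layer is typically enabled by passing D3D11_CREATE_DEVICE_DEBUG when the Direct3D device is created. The following is a minimal sketch and not the program's actual device-creation code; the variable names are illustrative:

UINT creationFlags = 0;
#if defined(_DEBUG)
// Turns on the debug layer that emits the D3D11 ERROR messages quoted above.
creationFlags |= D3D11_CREATE_DEVICE_DEBUG;
#endif

winrt::com_ptr<ID3D11Device>        d3dDevice;
winrt::com_ptr<ID3D11DeviceContext> d3dContext;

winrt::check_hresult(D3D11CreateDevice(
  nullptr,                   // default adapter
  D3D_DRIVER_TYPE_HARDWARE,
  nullptr,                   // no software rasterizer module
  creationFlags,
  nullptr, 0,                // default feature levels
  D3D11_SDK_VERSION,
  d3dDevice.put(),
  nullptr,                   // resulting feature level not needed here
  d3dContext.put()));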


Solution

  • The only way that I found, and the way hinted at in the comments, to access the vertex and index buffers from a DirectX 11 compute shader is to bind them as ByteAddressBuffer. This is not very convenient because, as the name implies, all reads are addressed by byte offset, and it quickly becomes cumbersome to get values out of the buffer.

    Because the ByteAddressBuffer is a read-only buffer from the perspective of the shader, a Shader Resource View (SRV) is needed to bind the buffer to the shader. To achieve this, I had to modify the code posted in the question as outlined below.

    First, the vertex buffer must be created with additional flags (the misc flag D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS and the bind flag D3D11_BIND_SHADER_RESOURCE). After that, the SRV for it must be created, which then makes it possible to bind the buffer to the shader:

    UINT stride = sizeof(VertexWithNormal);
    UINT offset = 0;
    
    D3D11_SUBRESOURCE_DATA vertexBufferData{};
    vertexBufferData.pSysMem = _model->GetVertexPointer();
    
    D3D11_BUFFER_DESC vertexBufferDesc{};
    vertexBufferDesc.Usage               = D3D11_USAGE_DEFAULT;
    vertexBufferDesc.BindFlags           = D3D11_BIND_VERTEX_BUFFER | D3D11_BIND_SHADER_RESOURCE;
    vertexBufferDesc.MiscFlags           = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS;
    vertexBufferDesc.ByteWidth           = 1234;
    vertexBufferDesc.StructureByteStride = stride;
    
    winrt::check_hresult(_d3dDevice->CreateBuffer(&vertexBufferDesc, &vertexBufferData, _vertexBuffer.put()));
    
    ID3D11Buffer* vertexBuffer = _vertexBuffer.get();
    
    _d3dContext->IASetVertexBuffers(0, 1, &vertexBuffer, &stride, &offset);
    
    D3D11_SHADER_RESOURCE_VIEW_DESC shaderResourceViewDesc{};
    shaderResourceViewDesc.Format               = DXGI_FORMAT_R32_TYPELESS;
    shaderResourceViewDesc.ViewDimension        = D3D11_SRV_DIMENSION_BUFFEREX;
    shaderResourceViewDesc.BufferEx.Flags       = D3D11_BUFFEREX_SRV_FLAG_RAW;
    shaderResourceViewDesc.BufferEx.NumElements = 1234 / 4; // The size of the buffer here seems to be specified in 4-byte elements, thus divide by 4
    
    winrt::check_hresult(_d3dDevice->CreateShaderResourceView(vertexBuffer, &shaderResourceViewDesc, _vertexBufferShaderResourceView.put()));
    
    ID3D11ShaderResourceView* vertexBufferShaderResourceView = _vertexBufferShaderResourceView.get();
    
    _d3dContext->CSSetShaderResources(0, 1, &vertexBufferShaderResourceView);
    

    The index buffer creation code must be modified in the same manner:

    D3D11_SUBRESOURCE_DATA indexBufferData{};
    indexBufferData.pSysMem = _model->GetIndexPointer();
    
    D3D11_BUFFER_DESC indexBufferDesc{};
    indexBufferDesc.Usage     = D3D11_USAGE_DEFAULT;
    indexBufferDesc.BindFlags = D3D11_BIND_INDEX_BUFFER | D3D11_BIND_SHADER_RESOURCE;
    indexBufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS;
    indexBufferDesc.ByteWidth = 4321;
    
    winrt::check_hresult(_d3dDevice->CreateBuffer(&indexBufferDesc, &indexBufferData, _indexBuffer.put()));
    
    _d3dContext->IASetIndexBuffer(_indexBuffer.get(), DXGI_FORMAT_R32_UINT, 0);
    
    D3D11_SHADER_RESOURCE_VIEW_DESC shaderResourceViewDesc{};
    shaderResourceViewDesc.Format               = DXGI_FORMAT_R32_TYPELESS;
    shaderResourceViewDesc.ViewDimension        = D3D11_SRV_DIMENSION_BUFFEREX;
    shaderResourceViewDesc.BufferEx.Flags       = D3D11_BUFFEREX_SRV_FLAG_RAW;
    shaderResourceViewDesc.BufferEx.NumElements = 4321 / 4; // The size of the buffer here seems to be specified in 4-byte elements, thus divide by 4
    
    winrt::check_hresult(_d3dDevice->CreateShaderResourceView(_indexBuffer.get(), &shaderResourceViewDesc, _indexBufferShaderResourceView.put()));
    
    ID3D11ShaderResourceView* indexBufferShaderResourceView = _indexBufferShaderResourceView.get();
    
    _d3dContext->CSSetShaderResources(1, 1, &indexBufferShaderResourceView);
    

    Because the shader now has to access the data via byte offsets, the Vertex struct in the HLSL code is no longer usable and has been removed. The modified shader code looks as follows:

    ByteAddressBuffer VertexBuffer : register(t0);
    ByteAddressBuffer IndexBuffer  : register(t1);
    
    [numthreads(128, 1, 1)]
    void main(uint3 dispatchThreadId : SV_DispatchThreadID)
    {
      uint triangleIndex = dispatchThreadId.x; // one thread processes one triangle
    
      // The below size is calculated based on the "VertexWithNormal" struct
      // defined in the C++ code:
      // 2 * sizeof(XMFLOAT4) + 4 * sizeof(int) = 2 * 4 * 4 + 4 * 4 = 48 bytes
      const uint vertexSizeInBytes = 48;
    
      // A triangle consists of 3 vertices. Thus, 3 vertex indices of the index
      // buffer form one triangle:
      // 3 * sizeof(int) = 3 * 4 = 12 bytes
      const uint indexOffset = triangleIndex * 12;
    
      const uint vertexOffset0 = IndexBuffer.Load(indexOffset)     * vertexSizeInBytes;
      const uint vertexOffset1 = IndexBuffer.Load(indexOffset + 4) * vertexSizeInBytes;
      const uint vertexOffset2 = IndexBuffer.Load(indexOffset + 8) * vertexSizeInBytes;
    
      const float4 vertexCoordinates0 = asfloat(VertexBuffer.Load4(vertexOffset0));
      const float4 vertexCoordinates1 = asfloat(VertexBuffer.Load4(vertexOffset1));
      const float4 vertexCoordinates2 = asfloat(VertexBuffer.Load4(vertexOffset2));
    
      // Calculations with the vertices...
    }
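
    For completeness, here is a sketch of how the modified compute shader can be dispatched with one thread per triangle. Neither the compute shader object nor the triangle count appear in the code above, so _computeShader and GetIndexCount() below are assumptions used for illustration only:

    // Sketch only: _computeShader (an ID3D11ComputeShader) and
    // _model->GetIndexCount() are hypothetical; they are not shown above.
    const UINT triangleCount = static_cast<UINT>(_model->GetIndexCount()) / 3;

    _d3dContext->CSSetShader(_computeShader.get(), nullptr, 0);

    // numthreads(128, 1, 1) with one triangle per thread: round up to whole groups.
    _d3dContext->Dispatch((triangleCount + 127) / 128, 1, 1);

    // Optionally clear the CS bindings once the pass is done.
    ID3D11ShaderResourceView* nullSrvs[2] = { nullptr, nullptr };
    _d3dContext->CSSetShaderResources(0, 2, nullSrvs);

    Because the dispatch is rounded up to whole thread groups, threads whose triangle index is beyond the triangle count should return early in the shader.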
    

    Extracting information from the vertex and index buffers this way is error-prone, but I unfortunately did not find a better way. Apparently, the DirectX shader compiler does offer more streamlined ways of reading values, as outlined here, but that seems to be part of a compiler version newer than what I have available.
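
    One way to make the raw reads less error-prone with the older compiler is to reintroduce the Vertex struct on the HLSL side and centralize the byte-offset arithmetic in a small load helper. The following is only a sketch; it assumes the 48-byte layout and the VertexBuffer declaration shown above:

    struct Vertex
    {
      float4 Coordinates;
      float4 Normal;
      int    Whatever1;
      int    Whatever2;
      int    Whatever3;
      int    Whatever4;
    };

    static const uint VertexSizeInBytes = 48;

    // Reads one vertex from the raw VertexBuffer declared above. Keeping all
    // byte offsets in a single place reduces the chance of getting them wrong.
    Vertex LoadVertex(uint vertexIndex)
    {
      const uint offset = vertexIndex * VertexSizeInBytes;

      Vertex result;
      result.Coordinates = asfloat(VertexBuffer.Load4(offset));
      result.Normal      = asfloat(VertexBuffer.Load4(offset + 16));
      result.Whatever1   = asint(VertexBuffer.Load(offset + 32));
      result.Whatever2   = asint(VertexBuffer.Load(offset + 36));
      result.Whatever3   = asint(VertexBuffer.Load(offset + 40));
      result.Whatever4   = asint(VertexBuffer.Load(offset + 44));
      return result;
    }

    // Usage inside main:
    //   Vertex vertex0 = LoadVertex(IndexBuffer.Load(indexOffset));

    That keeps main free of magic offsets; a similar helper can be written for reading the index buffer.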