directx-12

Concurrent Write Operations


https://learn.microsoft.com/en-us/windows-hardware/drivers/display/enhanced-barriers

The one-writer-at-a-time policy still applies since two seemingly nonoverlapping write regions might still have overlapping cache lines.

Since there must be no pending commands or cache flush operations between ExecuteCommandLists boundaries, buffers might be initially accessed in an ExecuteCommandLists scope without a barrier.

I'm copying real-time data from an upload buffer into a (default-heap) ring buffer, on the same thread and the same queue, with the ID3D12CommandLists and ID3D12CommandAllocators buffered per frame.

Is there a possibility that writes (from different ExecuteCommandLists calls) could interfere with each other due to overlapping cache lines?
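To make the concern concrete, here is a minimal sketch in plain C++ (no D3D12 calls) of how two write regions can be disjoint in bytes yet still touch the same cache line. The 128-byte line size and the names `ByteRange`, `BytesOverlap` and `CacheLinesOverlap` are assumptions made up for illustration; real GPU cache-line sizes are hardware-specific.

```cpp
#include <cstdint>

// Hypothetical cache-line size in bytes. This is an illustrative assumption,
// not a value taken from any GPU or from the D3D12 documentation.
constexpr std::uint64_t kAssumedCacheLineSize = 128;

// A byte range [offset, offset + size) inside a buffer.
struct ByteRange {
    std::uint64_t offset;
    std::uint64_t size;
};

// True when the two ranges share at least one byte.
constexpr bool BytesOverlap(ByteRange a, ByteRange b) {
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// True when the two ranges touch at least one common cache line,
// even if they do not share any byte.
constexpr bool CacheLinesOverlap(ByteRange a, ByteRange b,
                                 std::uint64_t lineSize = kAssumedCacheLineSize) {
    if (a.size == 0 || b.size == 0) {
        return false;  // An empty range touches no cache line.
    }
    // Index of the first and last cache line covered by each range.
    const std::uint64_t aFirst = a.offset / lineSize;
    const std::uint64_t aLast = (a.offset + a.size - 1) / lineSize;
    const std::uint64_t bFirst = b.offset / lineSize;
    const std::uint64_t bLast = (b.offset + b.size - 1) / lineSize;
    return aFirst <= bLast && bFirst <= aLast;
}

// Two back-to-back 100-byte writes: no shared bytes, but both touch line 0.
static_assert(!BytesOverlap({0, 100}, {100, 100}), "byte ranges are disjoint");
static_assert(CacheLinesOverlap({0, 100}, {100, 100}), "cache line 0 is shared");

// Two writes aligned to the assumed line size do not share a cache line.
static_assert(!CacheLinesOverlap({0, 128}, {128, 128}), "aligned writes are separate");
```

Under this assumption, the first pair of writes is the "seemingly nonoverlapping" case the documentation warns about, while the aligned pair is not.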

Edit

https://microsoft.github.io/DirectX-Specs/d3d/D3D12EnhancedBarriers.html

Unfortunately, due to how many GPUs manage resource caches, concurrent write operations to seemingly non-overlapping regions of the same subresource may still result in data corruption without a barrier. Therefore, buffers and simultaneous-access textures may be accessed by any number of read operations and up to one write operation concurrently as long as the read regions do not intersect the write regions.

What counts as a concurrent write operation? Does calling ExecuteCommandLists before the previous calls have finished (on the same queue, as happens when frame buffering) count as concurrent write operations?


Solution

  • As I read the D3D12 documentation, when ring buffering you should use ID3D12CommandQueue::Wait to ensure there are no write-after-write hazards, even when you are only writing from a single queue.

    However, in practice this does not seem to matter. See https://asawicki.info/news_1722_secrets_of_direct3d_12_copies_to_the_same_buffer, which concludes:

    What a mess! It seems that Direct3D 12 requires putting explicit barriers between our commands sometimes, automatically synchronizes some others, and doesn't even describe it all clearly in the documentation. The only general rule I can think of is that it cannot track resources bound through descriptors (like SRV, UAV), but tracks those that are bound in a more direct way (as render target, depth-stencil, clear target, copy destination) and synchronizes them automatically.

    Edit

    One thing that actually does cause a write-after-write hazard is sharing intermediate "PingPong" buffers between different compute queues.