Multiple devices vs. one device with node masks

I'm confused by the DirectX 12 approach to handling multi-GPU systems.
What is the difference between using distinct ID3D12Devices and using a single device with node-masks?
Why does the API even provide two different approaches?

I thought that when you enumerate IDXGIAdapters via IDXGIFactory1::EnumAdapters(), you are iterating over the actual GPU hardware (plus software emulations), which you can use to create an ID3D12Device using D3D12CreateDevice(adapter, ...). To my understanding, each IDXGIAdapter refers to a different GPU. So when you instantiate them into ID3D12Devices, each of your ID3D12Devices is a different GPU.

That all made sense to me until the API started doing things like device->CreateCommandList(nodeMask, ...), where the node mask refers to multiple GPUs. This is super confusing, because I thought the device refers to a single GPU and you would just call CreateCommandList for each GPU/device separately, but instead they introduced the node mask. So then does it matter what device I use to call things like CreateCommandList? You'd think the API would have used the DXGI factory instead to do things like CreateCommandList, but it didn't. Is there a reason?

Any help is much appreciated

Solution

Node masks are used for two or more physically linked GPUs, like CrossFire (mGPU?) or SLI/NVLink. Those are treated as a single adapter, so node masks are used instead (the GPUs won't be available as separate adapters).
Note that this is mostly available only for physically linked GPUs; otherwise there is only one node (it's up to the driver/vendor).

Basically, with node masks you create the adapter and device exactly the same way as with a single GPU, but your device may have multiple nodes. You call ID3D12Device::GetNodeCount and create your command queues, command allocators, command lists, fences, descriptor heaps, etc. separately for each node (the node mask is 1 << index, where 0 <= index < node count). Everything else can be shared between your nodes (including root signatures, pipeline states, and, with careful usage, resources). More info is in the docs.
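To make the mask arithmetic concrete, here is a minimal sketch in plain C++ with no D3D12 headers, so it compiles anywhere. PerNodeMasks and AllNodesMask are hypothetical helper names of mine, not part of the API. In real code the nodeCount argument would come from ID3D12Device::GetNodeCount(), and each returned mask would go into something like D3D12_COMMAND_QUEUE_DESC::NodeMask or the nodeMask parameter of CreateCommandList.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: one single-bit mask per node, i.e. 1 << index.
// Use these for objects that belong to exactly one node
// (command queues, command lists, descriptor heaps).
std::vector<std::uint32_t> PerNodeMasks(std::uint32_t nodeCount) {
    assert(nodeCount >= 1 && nodeCount <= 32);
    std::vector<std::uint32_t> masks;
    masks.reserve(nodeCount);
    for (std::uint32_t index = 0; index < nodeCount; ++index) {
        masks.push_back(1u << index);
    }
    return masks;
}

// Hypothetical helper: a mask with one bit set for every node.
// Use this for objects that may be shared across nodes
// (root signatures, pipeline states).
std::uint32_t AllNodesMask(std::uint32_t nodeCount) {
    assert(nodeCount >= 1 && nodeCount <= 32);
    // Shifting a 32-bit value by 32 is undefined, so handle that case explicitly.
    return nodeCount == 32 ? 0xFFFFFFFFu : (1u << nodeCount) - 1u;
}
```

On a device that reports a single node this gives just the mask 1, so the same loop works whether or not the GPUs are linked.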

So then does it matter what device I use to call things like CreateCommandList?

It does matter which device you use to create a command list: it is bound to the device/node you created it with and can't be used with interfaces created from a different device (or for a different node). The same goes for other interfaces (it should be trivial to figure out, but once again, see the docs). However, devices created from the same adapter (an adapter with the same ID) are essentially references to a single device.

You'd think the API would have used the DXGI factory instead to do things like CreateCommandList, but it didn't. Is there a reason?

Not quite. The DXGI factory is used to enumerate adapters, and node masks serve a different purpose, so overall there is no point in such coupling. Of course, it would have been better if the API had avoided node masks and used some special adapter properties to connect adapters explicitly, but node masks are used/supported very rarely anyway because they require hardware/driver support.

Overall it's a weird design indeed, but it's partially related to the nature of tech like SLI. For example, the GPUs often have to be exactly the same for SLI to work, so why not represent them as one device with additional features anyway?

But of course you can just ignore the other nodes and use your device regularly, as with a single GPU.

MSDN has plenty of information about that: https://learn.microsoft.com/en-us/windows/win32/direct3d12/multi-engine

And a sample: https://learn.microsoft.com/en-us/samples/microsoft/directx-graphics-samples/d3d12-linked-gpu-sample-uwp/

Also a sample without node masks, using separate devices: https://learn.microsoft.com/en-us/samples/microsoft/directx-graphics-samples/d3d12-heterogeneous-multiadapter-sample-uwp/

An NVIDIA post: https://developer.nvidia.com/explicit-multi-gpu-programming-directx-12