I'm trying to understand how a PCI segment (domain) is related to multiple Host Bridges.
Some people say multiple PCI domains correspond to multiple Host Bridges, but others say they mean multiple Root Bridges under a single Host Bridge. I'm confused, and I can't find much useful information in the PCI-SIG base spec.
I wonder:
(1) Suppose I set up 3 PCI domains in MCFG: do I have 3 Host Bridges connecting 3 CPUs to their buses, or do I have 3 Root Bridges that support 3x the buses but all share a common Host Bridge on one CPU?
(2) If I have multiple Host Bridges (or Root Bridges), do these bridges share a common South Bridge (e.g., ICH9), or do they have separate ones?
I'm a beginner and googling didn't help much. I would appreciate it if someone could give me some clues.
The wording used is confusing.
I'll try to fix the terminology with a brief and incomplete summary of the PCI and PCI Express technologies.
Skip to the last section to read the answers.
The Conventional PCI bus (henceforward PCI) is designed around a bus topology: a shared bus is used to connect all the devices.
To create more complex hierarchies, some devices can operate as bridges: a bridge connects a PCI bus to another, secondary, bus.
The secondary bus can be another PCI bus (the device is then called a PCI-to-PCI bridge, henceforward P2P) or a bus of a different type (e.g. a PCI-to-ISA bridge).
This creates a topology of the form:
             _____         _______
------------| P2P |-------| P2ISA |----------- PCI BUS 0
             ‾‾|‾‾         ‾‾‾|‾‾‾
               |              |
---------------|--------------+------------ ISA BUS 0
               |
---------------+---------------------------- PCI BUS 1
Informally, each PCI bus is called a PCI segment.
In the picture above, two segments are shown (PCI BUS 0 and PCI BUS 1).
PCI defines three types of transactions: Memory, IO, and Configuration.
The first two are assumed to be required knowledge.
The third one is used to access the configuration address space (CAS) of each device; within this CAS it's possible to meta-configure the device, for example to set where it is mapped in the system memory address space.
In order to access the CAS of a device, the devices must be addressable.
Electrically, each PCI slot (integrated or not) in a PCI bus segment is wired to create an addressing scheme made of three parts: device (0-31), function (0-7), register (0-255).
Each device can have up to eight logical functions, each one with a CAS of 256 bytes.
A bus number is added to the triple above to uniquely identify a device within the whole bus topology (and not only within the bus segment).
This quadruplet is called ID address.
It's important to note that these ID addresses are assigned by the software (except for the device part, which is fixed by the wiring).
They are logical; however, it is advised to number the buses sequentially from the root.
The CPU doesn't generate PCI transactions natively; a Host Bridge is necessary.
It is a bridge (conceptually a Host-to-PCI bridge) that lets the CPU perform PCI transactions.
For example, in the x86 case, any memory write or IO write not reclaimed by other agents (e.g. memory, memory mapped CPU components, legacy devices, etc.) is passed to the PCI bus by the Host Bridge.
To generate CAS transactions, an x86 CPU writes to the IO ports 0xcf8 and 0xcfc (the first holds the ID address, the second the data to read/write).
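The layout of the dword written to 0xcf8 can be sketched as follows (a minimal illustration of the legacy configuration mechanism; the function name is mine, the bit layout is the standard one):

```python
def config_address(bus: int, device: int, function: int, register: int) -> int:
    """Build the dword written to IO port 0xcf8 (CONFIG_ADDRESS).

    Bit 31 is the enable bit; the register offset is dword-aligned.
    The data is then read/written through IO port 0xcfc (CONFIG_DATA).
    """
    assert bus < 256 and device < 32 and function < 8 and register < 256
    return (1 << 31) | (bus << 16) | (device << 11) | (function << 8) | (register & 0xFC)

# Example: bus 0, device 3, function 0, register 0x24
print(hex(config_address(0, 3, 0, 0x24)))  # 0x80001824
```

Note how the bus/device/function/register quadruplet (the ID address) maps directly onto bit fields of the dword.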
A CPU can have more than one Host Bridge; nothing prevents it, though it's very rare.
More likely, a system can have more than one CPU, and with a Host Bridge integrated into each of them, the system has more than one Host Bridge.
For PCI, each Host Bridge establishes a PCI domain: a set of bus segments.
The main characteristic of a PCI domain is that it is isolated from other PCI domains: a transaction is not required to be routable between domains.
An OS can assign the bus numbers of each PCI domain as it pleases; it can reuse the bus numbers or assign them sequentially:
        NON OVERLAPPING         |        NON OVERLAPPING
                                |
  Host-to-PCI     Host-to-PCI   |  Host-to-PCI     Host-to-PCI
    bridge 0        bridge 1    |    bridge 0        bridge 1
                                |
       |               |        |       |               |
       |               |        |       |               |
 BUS 0 |         BUS 2 |        | BUS 0 |         BUS 0 |
       |               |        |       |               |
    +------+        +------+    |    +------+        +------+
    |      |        |      |    |    |      |        |      |
    |      |        |      |    |    |      |        |      |
    |    BUS 1      |    BUS 3  |    |    BUS 1      |    BUS 1
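The sequential ("non overlapping") case can be sketched as a depth-first walk that threads a single counter through every Host Bridge; restarting the counter at 0 for each Host Bridge gives the "overlapping" case. This is an illustration only, not actual kernel code:

```python
def number_buses(bridge: dict, next_bus: int) -> int:
    """Depth-first bus numbering below a (host or P2P) bridge.

    `bridge` is a toy node: {"children": [sub-bridges...]}.
    Returns the next free bus number.
    """
    bridge["secondary"] = next_bus        # the bus directly below this bridge
    next_bus += 1
    for child in bridge["children"]:      # P2P bridges found on that bus
        next_bus = number_buses(child, next_bus)
    bridge["subordinate"] = next_bus - 1  # highest bus number behind this bridge
    return next_bus

# Two Host Bridges, each with one P2P bridge below (as in the figure).
hb0 = {"children": [{"children": []}]}
hb1 = {"children": [{"children": []}]}

# Non overlapping: one counter shared across both domains.
n = number_buses(hb0, 0)   # hb0 gets BUS 0, its P2P gets BUS 1
number_buses(hb1, n)       # hb1 gets BUS 2, its P2P gets BUS 3

# Overlapping: the second domain restarts from 0.
number_buses(hb1, 0)       # hb1 now gets BUS 0, its P2P gets BUS 1
```

The secondary/subordinate pair each bridge ends up with is what lets a bridge decide whether a transaction targets a bus behind it.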
Unfortunately, the word PCI domain also has a meaning in the Linux kernel; there it is used to number each Host Bridge.
As far as PCI is concerned this works, but with the introduction of PCI Express it gets confusing, because PCI Express has its own name for the "Host Bridge number" (i.e. the PCI segment group) and the term PCI domain denotes the downstream link of a PCI Express root port.
The PCI Express bus (henceforward PCIe) is designed around a point-to-point topology: a device is connected only to another device.
To maintain software compatibility, an extensive use of virtual P2P bridges is made.
While the basic components of the PCI bus were devices and bridges, the basic components of the PCIe are devices and switches.
From the software perspective, nothing is changed (but for new features added) and the bus is enumerated the same way: with devices and bridges.
The PCIe switch is the basic glue between devices, it has n downstream ports.
Internally the switch has a PCI bus segment, and for each port a virtual P2P bridge is created on that internal bus segment (the adjective virtual is there because each P2P only responds to CAS transactions; that's enough for PCI-compatible software).
Each downstream port is a PCIe link.
A PCIe link is regarded as a PCI bus segment; this checks with the fact that the switch has a P2P bridge for each downstream port (in total there are 1 + n PCI bus segments for a switch).
A switch has one more port: the upstream port.
It is just like a downstream port, but it uses subtractive decoding: just like for a network switch, it is used to receive traffic from the "logical external network" and to route unknown destinations.
So a switch takes up 1 + n + 1 PCI bus segments.
Devices are connected directly to a switch.
In the PCI case, a bridge connected the CPU to the PCI subsystem, so it's logical to expect a switch to connect the CPU to the PCIe subsystem.
This is indeed the case, with the PCI Root Complex (PCR).
The PCR is basically a switch with an important twist: each one of its ports establishes a new PCI domain.
This means that it is not required to route traffic from port 1 to port 2 (while a switch, of course, is required to).
This creates a clash with the Linux terminology mentioned before, because Linux assigns a domain number to each Host Bridge or PCR while, per the specification, each PCR has multiple domains.
Long story short: same word, different meanings.
The PCIe specification uses the term PCI segment group to define a numbering per PCR (simply put, each PCI segment group selects the base address of the extended CAS mechanism of a PCR, so there is a native one-to-one mapping).
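Concretely, the extended CAS (ECAM) maps each function's 4 KiB configuration space into memory: the MCFG table gives one base address per PCI segment group, and the address of a register is computed as below (a sketch; the base address is a made-up example):

```python
def ecam_address(base: int, bus: int, device: int, function: int, register: int) -> int:
    """Memory address of a configuration register in the extended CAS (ECAM).

    `base` comes from the MCFG entry of the function's PCI segment group;
    each function gets a 4 KiB (12-bit) window, hence the shift amounts.
    """
    assert bus < 256 and device < 32 and function < 8 and register < 4096
    return base + (bus << 20) + (device << 15) + (function << 12) + register

# Hypothetical segment group based at 0xE0000000: bus 1, device 2, function 3
print(hex(ecam_address(0xE0000000, 1, 2, 3, 0x10)))  # 0xe0113010
```

Since the segment group picks the base address, two functions with the same bus/device/function numbers in different segment groups land at different memory addresses: that's the isolation in practice.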
Due to their isolation property, the ports of the PCR are called PCIe Root Port.
Note
The term Root Bridge doesn't exist in the specification; I can only find it in the UEFI Root Bridge IO Specification, as an umbrella term for both the Host Bridge and the PCR (since they share similar duties).
The Host Bridge also goes under the name of Host Adapter.
(1) Suppose I set up 3 PCI domains in MCFG: do I have 3 Host Bridges connecting 3 CPUs to their buses, or do I have 3 Root Bridges that support 3x the buses but all share a common Host Bridge on one CPU?
If you have 3 PCI domains, you either have 3 Host Bridges or 3 PCIe root ports.
If by PCI domains you meant PCI buses, in the sense of PCI bus segments (irrespective of their isolation), then you can have either a single Host Bridge/PCR handling a topology with 3 buses, or more than one Host Bridge/PCR handling some combination of the 3 buses.
There is no specific requirement in this case; as shown above, it's possible to cascade buses with bridges.
If you want the buses not to be isolated (so not to be separate PCI domains), you need a single Host Bridge or a single PCIe root port.
A set of P2P bridges (either real or virtual) will connect the buses together.
(2) If I have multiple Host Bridges (or Root Bridges), do these bridges share a common South Bridge (e.g., ICH9), or do they have separate ones?
The bridged platform faded out years ago; we now have a System Agent integrated into the CPU that exposes a set of PCIe lanes (typically 20+) and a Platform Controller Hub (PCH) connected to the CPU with a DMI link.
The PCH can also be integrated into the same socket as the CPU.
The PCH exposes some more lanes that, from a software perspective, appear to come from the CPU's PCR.
Anyway, back when you could have multiple Host Bridges, they were usually on different sockets, but there was typically only a single South Bridge for them all.
However, this was not (and is not) strictly mandatory.
The modern Intel C620 PCH can operate in Endpoint Only Mode (EPO), where it is not used as the main PCH (with firmware and boot responsibilities) but as a set of PCIe endpoints.
The idea is that a Host Bridge just converts CPU transactions into PCI transactions; where these transactions are routed depends on the bus topology, and that is by itself a very creative topic.
Where the components of this topology are integrated is another creative task: in the end, it's possible to have separate chips dedicated to each Host Bridge, a big single chip shared (or partitioned) among them all, or even both at the same time!