InfiniBand explained


Can anybody explain what InfiniBand is? What are the key differences compared with Ethernet, and how do these differences make it faster than Ethernet?

In the official description from Mellanox, it is written that

Introduce InfiniBand, a switch-based serial I/O interconnect architecture operating at...

What does it mean that InfiniBand is a switch-based interconnect? I found this description, but it does not explain what happens if several inputs want to write to a single output. How is the collision resolved?

It is also said that InfiniBand has end-to-end flow control. Does this mean that there is no need for any other (in-between) flow control? Why?


Solution

  • The key difference between Ethernet and InfiniBand, and the one that makes InfiniBand faster, is RDMA (Remote Direct Memory Access). DMA (in networking) is an operation in which the NIC (Network Interface Controller) accesses memory directly, without involving the CPU. RDMA is the same idea, but the direct memory access is performed on behalf of a remote machine.

    More differences:

    1. Communication is done between QPs (Queue Pairs) instead of channels.
    2. Data flows directly between user space and the HW instead of going through the kernel stack.

    A basic RDMA flow between a requestor and a responder would consist of:

    1. Handshake - exchange details between requestor and responder (mainly allocated memory addresses and access keys).
    2. Create a READ/WRITE/ATOMIC request on the requestor side.
    3. Send the request to the responder.
    4. Directly access the memory on the responder side.
    5. If READ/ATOMIC - send the data read from the responder's memory back to the requestor.
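
    The five steps above can be mimicked in a few lines of plain Python. This is only a toy model under stated assumptions, not the verbs API: `ToyResponder`, `rdma`, the dictionary-based request format and the `FETCH_ADD` operation name are all made up for illustration, and the "NIC" is just a method call. What it tries to show is that after the handshake the responder's application code never runs; the request is checked against the access key and then applied to memory directly.

```python
import struct


class ToyResponder:
    """Hypothetical stand-in for the responder's NIC plus one registered buffer."""

    def __init__(self, size, rkey):
        self.memory = bytearray(size)  # registered buffer the "NIC" may touch
        self.rkey = rkey               # access key handed out in the handshake

    def handshake(self):
        # Step 1: share the buffer address, its length and the access key.
        return {"addr": 0, "length": len(self.memory), "rkey": self.rkey}

    def handle(self, request):
        # Steps 4-5: the "NIC" validates the key and bounds, then touches
        # memory directly; no responder application code is involved.
        if request["rkey"] != self.rkey:
            raise PermissionError("invalid rkey")
        op, addr = request["op"], request["addr"]
        if op == "WRITE":
            data = request["data"]
            self._check_bounds(addr, len(data))
            self.memory[addr:addr + len(data)] = data
            return None
        if op == "READ":
            self._check_bounds(addr, request["length"])
            return bytes(self.memory[addr:addr + request["length"]])
        if op == "FETCH_ADD":  # one example of an ATOMIC operation
            self._check_bounds(addr, 8)
            old = struct.unpack_from("<Q", self.memory, addr)[0]
            new = (old + request["add"]) % 2**64
            struct.pack_into("<Q", self.memory, addr, new)
            return old  # the value before the add goes back to the requestor
        raise ValueError(f"unknown operation: {op}")

    def _check_bounds(self, addr, length):
        if addr < 0 or length < 0 or addr + length > len(self.memory):
            raise IndexError("access outside the registered region")


def rdma(responder, info, op, offset, **fields):
    # Steps 2-3: build the request on the requestor side and "send" it.
    request = {"op": op, "addr": info["addr"] + offset, "rkey": info["rkey"]}
    request.update(fields)
    return responder.handle(request)


responder = ToyResponder(size=64, rkey=0x1234)
info = responder.handshake()
rdma(responder, info, "WRITE", 0, data=b"hello")
read_back = rdma(responder, info, "READ", 0, length=5)
first = rdma(responder, info, "FETCH_ADD", 8, add=5)
second = rdma(responder, info, "FETCH_ADD", 8, add=1)
```

    In real hardware the key check and the memory access are done by the NIC, which is why a request carrying the wrong key is rejected without the responder's software ever seeing it.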

    Main benefits:

    1. No CPU involvement on the responder side - throughput is limited by the HW (NIC & PCI) only.
    2. No SW runs on the responder side - this allows much lower latency (roughly 10 times lower than typical TCP/UDP latency).
    3. Supports "polling mode" for completions on the requestor side, meaning the SW knows as soon as the HW has finished transmitting. This allows for lower latency and higher throughput, at the expense of high CPU utilization.
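
    The polling trade-off in the last point can be illustrated with another toy sketch. Again, nothing here is a real RDMA API: `ToyCompletionQueue` and `busy_poll` are hypothetical names, and a timer thread plays the role of the HW posting a completion. The requestor spins on the queue instead of sleeping, so it notices the completion almost immediately, but it burns CPU cycles for the whole wait.

```python
import threading
import time
from collections import deque


class ToyCompletionQueue:
    """Hypothetical queue that the "HW" fills and the SW polls."""

    def __init__(self):
        self._entries = deque()  # append/popleft are thread-safe in CPython

    def post(self, wr_id):
        # Called by the "HW" when a work request has completed.
        self._entries.append(wr_id)

    def poll(self):
        # Non-blocking: return one completion, or None if there is none yet.
        try:
            return self._entries.popleft()
        except IndexError:
            return None


def busy_poll(cq, timeout_s=5.0):
    # Spin until a completion shows up; the CPU is busy the entire time.
    deadline = time.monotonic() + timeout_s
    spins = 0
    while time.monotonic() < deadline:
        wr_id = cq.poll()
        if wr_id is not None:
            return wr_id, spins
        spins += 1
    raise TimeoutError("no completion arrived")


cq = ToyCompletionQueue()
# The timer stands in for the HW finishing a transfer about 10 ms from now.
threading.Timer(0.01, cq.post, args=(42,)).start()
wr_id, spins = busy_poll(cq)
```

    The value of `spins` depends on the machine, but it gives a feel for how many wasted iterations the requestor pays in exchange for reacting without a wake-up delay.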

    For more information, please refer to the InfiniBand spec (sorry, it is very long).

    Related traffic protocols:

    Hope this helps.