libp2p

Difference between multiplex and multistream


What is the difference between multistream (yamux, multistream-select, ..) and multiplex (mplex)? I'd like to use one TCP connection for RPC, HTTP, etc. (one client is behind a firewall), like this:

conn = tcp.connect("server.com:1111")
conn1, conn2 = conn.split()

stream1 = RPC(conn1)
stream2 = WebSocket(conn2)
..

// received packets tagged for conn1 are forwarded to stream1
// received packets tagged for conn2 are forwarded to stream2
// writing to stream1 tags the packets for conn1
// writing to stream2 tags the packets for conn2

Which one suits this case?


Solution

  • The short answer: mplex and yamux are both stream multiplexers (aka stream muxers), responsible for interleaving multiple "logical streams" over a single "raw" connection (e.g. TCP). Multistream identifies which protocol should be used when sending and receiving data over a stream, and multistream-select lets peers negotiate which protocols each end supports and, ideally, agree on one to use.

    Long answer:

    Stream muxing is an interface with several implementations. The "baseline" stream muxer is called mplex, a libp2p-specific protocol with implementations in JavaScript, Go and Rust.
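
To make "interleaving logical streams" concrete, here's a rough Python sketch of the tagging idea from the question. The frame layout (4-byte stream id, 4-byte length) and the names encode_frame and demux are made up for illustration; this is not the mplex or yamux wire format.

```python
import struct
from collections import defaultdict
from typing import Dict

# Toy frame layout (an assumption for illustration, NOT the mplex or yamux format):
# 4-byte big-endian stream id, 4-byte big-endian payload length, then the payload.
HEADER = struct.Struct(">II")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Tag a payload with the logical stream it belongs to."""
    return HEADER.pack(stream_id, len(payload)) + payload


def demux(raw: bytes) -> Dict[int, bytes]:
    """Split bytes read from the single raw connection back into per-stream data."""
    streams = defaultdict(bytearray)
    offset = 0
    while offset < len(raw):
        stream_id, length = HEADER.unpack_from(raw, offset)
        offset += HEADER.size
        streams[stream_id] += raw[offset:offset + length]
        offset += length
    return {sid: bytes(data) for sid, data in streams.items()}


# Two logical streams interleaved on one "connection" (here just a bytes buffer).
wire = (
    encode_frame(1, b"rpc-call ")
    + encode_frame(2, b"ws-hello ")
    + encode_frame(1, b"rpc-more")
)
per_stream = demux(wire)
```

Real muxers add more on top of this (opening and closing streams, and in yamux's case flow control), but the core job is the same: tag on write, route by tag on read.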

    Stream multiplexers are "pluggable", meaning that you add support for them by pulling in a module and configuring your libp2p app to use them. A given libp2p application can support several multiplexers at the same time, so for example, you might use yamux as the default but also support mplex to communicate with peers that don't support yamux.

    While having this kind of flexibility is great, it also means that we need a way to figure out what stream muxer to use for any specific connection. This is where multistream and multistream-select come in.

    Multistream (despite the name) is not directly related to stream multiplexing. Instead, it acts as a "header" for a stream of binary data that contextualizes the stream with a protocol id. The closely related multistream-select protocol uses multistream protocol ids to negotiate what protocols to use for the "next phase" of communication.
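
If it helps to picture the "header", here's a small Python sketch of how a protocol id can be framed on the wire. It assumes the framing used by multistream-select as I understand it (an unsigned-varint length prefix, then the UTF-8 protocol id terminated by a newline); the helper names are my own and don't come from any libp2p library.

```python
from typing import Tuple


def encode_uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte, low bits first)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_message(protocol_id: str) -> bytes:
    """Frame a protocol id: varint length, then the id followed by a newline."""
    body = protocol_id.encode("utf-8") + b"\n"
    return encode_uvarint(len(body)) + body


def decode_message(data: bytes) -> Tuple[str, bytes]:
    """Read one framed message; return (protocol id, remaining bytes)."""
    length, shift, pos = 0, 0, 0
    while True:
        byte = data[pos]
        pos += 1
        length |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    body = data[pos:pos + length]
    if len(body) != length or not body.endswith(b"\n"):
        raise ValueError("truncated or malformed message")
    return body[:-1].decode("utf-8"), data[pos + length:]


wire = encode_message("/multistream/1.0.0") + encode_message("/mplex/6.7.0")
first, rest = decode_message(wire)
second, leftover = decode_message(rest)
```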

    So, to agree upon what stream muxer to use, we use multistream-select.

    Here's an example of the multistream-select back-and-forth:

    /multistream/1.0.0 <- dialer says they'd like to use multistream 1.0.0
    /multistream/1.0.0 -> listener echoes back to indicate agreement
    /secio/1.0.0       <- dialer wants to use secio 1.0.0 for encryption
    /secio/1.0.0       -> listener agrees
    
    * secio handshake omitted. what follows is encrypted via secio: *
    
    /mplex/6.7.0       <- dialer would like to use mplex 6.7.0 for stream multiplexing
    /mplex/6.7.0       -> listener agrees
    

    This is the simple case where both sides agree on everything. If, for example, the listener didn't support /mplex/6.7.0, it could respond with na (not available), and the dialer could then try another protocol, ask for a list of supported protocols by sending ls, or give up.

    In the example above, both sides agreed on mplex, so all further communication over the open connection is subject to the semantics of mplex.
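
The "try another protocol after na" path can be sketched in a few lines of Python. This is an in-memory toy, not real network code: listener_reply and negotiate are hypothetical names, and the ls request is left out.

```python
from typing import Iterable, Optional, Set

NA = "na"  # the listener's "not available" reply


def listener_reply(supported: Set[str], proposal: str) -> str:
    """Echo the proposal back if it is supported, otherwise answer na."""
    return proposal if proposal in supported else NA


def negotiate(dialer_preferences: Iterable[str], listener_supported: Set[str]) -> Optional[str]:
    """Propose protocols in order of preference until the listener echoes one back."""
    for proposal in dialer_preferences:
        if listener_reply(listener_supported, proposal) == proposal:
            return proposal
    return None  # nothing in common: the dialer gives up


# The dialer prefers yamux, but this listener only speaks mplex.
chosen = negotiate(["/yamux/1.0.0", "/mplex/6.7.0"], {"/mplex/6.7.0"})
no_match = negotiate(["/yamux/1.0.0"], {"/mplex/6.7.0"})
```

Here the first proposal gets na, the second is echoed back, and chosen ends up as /mplex/6.7.0.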

    It's important to note that most of the details above will be "invisible" to you when opening individual connections in libp2p, since it's rare to use the multistream and stream muxing libraries directly.

    Instead, a libp2p component called the "switch" (also called the "swarm" by some implementations) manages the dialing / listening state for the application. The switch handles the multistream negotiation process and "hides" the details of which specific stream muxer is in use from the rest of the libp2p stack.

    As a libp2p developer, you generally dial other peers using the switch interface, which will give you a stream to read from and write to. Under the hood, the switch will find the appropriate transport (e.g. TCP / websockets) and use multistream-select to negotiate encryption & stream multiplexing. If you already have an open connection to the remote peer, the switch will just use the existing connection and open another muxed stream over it, instead of starting from scratch.
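
As a rough mental model of the connection reuse, here's a toy Python sketch. ToySwitch and ToyConnection are invented for this answer and don't mirror any libp2p implementation's API; the protocol ids are placeholders as well.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyConnection:
    """Stands in for one secured, muxed connection to a peer."""
    peer: str
    next_stream_id: int = 0
    streams: List[Tuple[int, str]] = field(default_factory=list)

    def open_stream(self, protocol_id: str) -> Tuple[int, str]:
        stream = (self.next_stream_id, protocol_id)
        self.next_stream_id += 1
        self.streams.append(stream)
        return stream


class ToySwitch:
    """Keeps at most one connection per peer and opens muxed streams over it."""

    def __init__(self) -> None:
        self.connections: Dict[str, ToyConnection] = {}
        self.dial_count = 0  # how many times we had to "dial from scratch"

    def dial(self, peer: str, protocol_id: str) -> Tuple[int, str]:
        conn = self.connections.get(peer)
        if conn is None:
            # A real switch would pick a transport and negotiate
            # encryption and a stream muxer at this point.
            self.dial_count += 1
            conn = ToyConnection(peer)
            self.connections[peer] = conn
        return conn.open_stream(protocol_id)


switch = ToySwitch()
rpc_stream = switch.dial("server.com:1111", "/my-rpc/1.0.0")  # placeholder protocol id
ws_stream = switch.dial("server.com:1111", "/my-ws/1.0.0")    # placeholder protocol id
```

Both dial calls return a separate stream, but only the first one creates a connection.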

    The same goes for listening for connections - you give the switch a protocol id and a stream handler function, and it will handle the muxing & negotiation process for you.
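
The listening side can be pictured the same way: a table from protocol id to handler function. Again this is only a Python sketch with made-up names (ToyListener, set_stream_handler, on_inbound_stream), and the handler here receives plain bytes rather than a real stream object.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[bytes], bytes]


class ToyListener:
    """Routes each inbound stream to the handler registered for its protocol id."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Handler] = {}

    def set_stream_handler(self, protocol_id: str, handler: Handler) -> None:
        self.handlers[protocol_id] = handler

    def on_inbound_stream(self, protocol_id: str, data: bytes) -> Optional[bytes]:
        handler = self.handlers.get(protocol_id)
        if handler is None:
            # In a real negotiation the remote side would get "na" here.
            return None
        return handler(data)


listener = ToyListener()
listener.set_stream_handler("/echo/1.0.0", lambda data: data)  # placeholder protocol id
reply = listener.on_inbound_stream("/echo/1.0.0", b"ping")
unknown = listener.on_inbound_stream("/unknown/1.0.0", b"ping")
```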

    Our documentation is a work-in-progress, but there is some information at https://docs.libp2p.io that might help clarify, especially the concept doc on Transports and the glossary. You can also find links to example code.

    Improving the docs for libp2p is my main quest at the moment, so please feel free to file issues at https://github.com/libp2p/docs to let me know what your most important missing pieces are.