Tags: rust, rust-tokio, rust-async-std

Why do asynchronous versions of a TCP echo server use 55x more memory than a synchronous one?


I have a simple TCP echo server using the standard library:

use std::net::TcpListener;

fn main() {
    let listener = TcpListener::bind("localhost:4321").unwrap();
    loop {
        let (conn, _addr) = listener.accept().unwrap();
        std::io::copy(&mut &conn, &mut &conn).unwrap();
    }
}

It uses about 11 MB of memory:

[Screenshot: memory usage of the standard library version]

Tokio

If I convert it to use tokio:

# Cargo.toml, under [dependencies]
tokio = { version = "0.2.22", features = ["full"] }

// src/main.rs
use tokio::net::TcpListener;

#[tokio::main]
async fn main() {
    let mut listener = TcpListener::bind("localhost:4321").await.unwrap();
    loop {
        let (mut conn, _addr) = listener.accept().await.unwrap();
        let (read, write) = &mut conn.split();
        tokio::io::copy(read, write).await.unwrap();
    }
}

It uses 607 MB of memory:

[Screenshot: memory usage of the tokio version]

async_std

Similarly, with async_std:

# Cargo.toml, under [dependencies]
async-std = "1.6.2"

// src/main.rs
use async_std::net::TcpListener;

fn main() {
    async_std::task::block_on(async {
        let listener = TcpListener::bind("localhost:4321").await.unwrap();
        loop {
            let (conn, _addr) = listener.accept().await.unwrap();
            async_std::io::copy(&mut &conn, &mut &conn).await.unwrap();
        }
    });
}

It also uses 607 MB of memory:

[Screenshot: memory usage of the async_std version]


Why do the asynchronous versions of the program use 55x more memory than the synchronous one?


Solution

  • I tried it here and, as you said in the comments, there are several 64 MB blocks:

    ==> pmap -d $(pidof tokio)
    3605:   target/release/tokio
    Address           Kbytes Mode  Offset           Device    Mapping
    …
    0000555b2a634000     132 rw--- 0000000000000000 000:00000   [ anon ]
    00007f2fec000000     132 rw--- 0000000000000000 000:00000   [ anon ]
    00007f2fec021000   65404 ----- 0000000000000000 000:00000   [ anon ]
    00007f2ff0000000     132 rw--- 0000000000000000 000:00000   [ anon ]
    00007f2ff0021000   65404 ----- 0000000000000000 000:00000   [ anon ]
    00007f2ff4000000     132 rw--- 0000000000000000 000:00000   [ anon ]
    00007f2ff4021000   65404 ----- 0000000000000000 000:00000   [ anon ]
    …
    

    Those blocks are neither readable nor writable, so they aren't backed by physical memory and don't actually use any. They simply represent reserved address space.

    Moreover, as you can see, each of those 65404K blocks comes immediately after a 132K block. Since 65404 + 132 is exactly 65536 (that is, 64 MB), I suspect that these blocks represent address space that is reserved in case the runtime needs to grow one of those 132K blocks later on. It might be interesting to see how things look after a couple of hours and a few thousand connections.
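
To make the arithmetic above concrete, here is a minimal sketch that adds up the `pmap -d` rows quoted in this answer, separating mappings with some access permission from those with none (reserved address space). The names `Mapping`, `parse_row`, `summarize` and `SAMPLE` are made up for this illustration, and the parser assumes the column layout shown above (address, Kbytes, mode, ...); it is not a general pmap parser.

```rust
/// One row of `pmap -d` output: size in Kbytes plus the permission string.
/// (Hypothetical helper type, only for this illustration.)
#[derive(Debug, PartialEq)]
struct Mapping {
    kbytes: u64,
    mode: String,
}

/// Parse a row such as
/// "00007f2fec021000   65404 ----- 0000000000000000 000:00000   [ anon ]".
/// Returns None for the header, the "…" lines and anything else that
/// does not start with a hex address followed by a decimal size.
fn parse_row(line: &str) -> Option<Mapping> {
    let mut fields = line.split_whitespace();
    let addr = fields.next()?;
    u64::from_str_radix(addr, 16).ok()?; // first column must be a hex address
    let kbytes = fields.next()?.parse().ok()?;
    let mode = fields.next()?.to_string();
    Some(Mapping { kbytes, mode })
}

/// Returns (Kbytes with some r/w/x permission, Kbytes with none).
fn summarize(pmap_output: &str) -> (u64, u64) {
    let mut accessible = 0;
    let mut reserved = 0;
    for m in pmap_output.lines().filter_map(parse_row) {
        if m.mode.starts_with("---") {
            reserved += m.kbytes; // no read/write/execute: address space only
        } else {
            accessible += m.kbytes;
        }
    }
    (accessible, reserved)
}

/// The rows quoted in the answer above (elided lines left out).
const SAMPLE: &str = "\
Address           Kbytes Mode  Offset           Device    Mapping
0000555b2a634000     132 rw--- 0000000000000000 000:00000   [ anon ]
00007f2fec000000     132 rw--- 0000000000000000 000:00000   [ anon ]
00007f2fec021000   65404 ----- 0000000000000000 000:00000   [ anon ]
00007f2ff0000000     132 rw--- 0000000000000000 000:00000   [ anon ]
00007f2ff0021000   65404 ----- 0000000000000000 000:00000   [ anon ]
00007f2ff4000000     132 rw--- 0000000000000000 000:00000   [ anon ]
00007f2ff4021000   65404 ----- 0000000000000000 000:00000   [ anon ]
";

fn main() {
    let (accessible, reserved) = summarize(SAMPLE);
    println!("accessible: {} K, reserved only: {} K", accessible, reserved);
    // Each rw block plus the reserved block that follows it spans 64 MB.
    println!("one pair: {} K = {} MB", 132 + 65404, (132 + 65404) / 1024);
}
```

For the rows quoted here, nearly all of the address space falls in the reserved-only bucket, which is why a tool that reports virtual size shows a far larger figure than the memory the process actually touches.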