multithreading, asynchronous, rust

async thread vs std thread


I'm curious about when to use tokio threads versus std threads.

In the past, I always thought async threads were for IO tasks and std threads were for CPU tasks.

However, from this discussion (https://www.reddit.com/r/rust/comments/13uvmr4/using_tokiospawn_vs_stdthreadspawn_for_a_game/), people mention that you can also write CPU tasks in async and await them.

So, I've come to these conclusions, but I'm not sure if they're correct:

  1. If you only have IO tasks, use async threads.
  2. If you only have CPU tasks, use std threads.
  3. If you have mixed tasks, use async threads.

Here's my context: I'm going to spawn 4 services where each service is asynchronous.

Should I use:

for _ in 0..4 {
    // thread::spawn takes a closure; each OS thread builds and drives its own runtime.
    std::thread::spawn(|| tokio::runtime::Runtime::new().unwrap().block_on(..));
}

Or

let rt = tokio::runtime::Builder::new_multi_thread()
        .worker_threads(4)
        .enable_all()
        .build()?;

for _ in 0..4 {
    rt.spawn(..);
}

Initially, I thought 4 std threads with 4 tokio runtimes would be more efficient. Now I'm unsure. Which setup performs better?


Solution

  • I think we need to look at why we use async/await before we can answer this question. Async/await is a technique that makes better use of CPU resources on IO-bound tasks: when a task awaits a value that is not yet ready, the runtime switches to another pending task instead of blocking. On the other hand, what you refer to as std threads are actually OS threads. In that case, the OS manages scheduling and switching between threads during execution.

    Importantly, async/await runtimes are written on top of OS threads. OS threads can do everything async/await can do and more. However, they are generally much more expensive to create and switch between than async tasks. If I want to start a new OS thread, I need to tell the OS about it and then wait for the system scheduler to actually start it. That new thread also requires allocating system resources (like setting up a new system stack). Plus, every thread of every process on your machine is competing for CPU time, so yielding to the OS scheduler does not guarantee the OS will choose to switch to a thread you are interested in.
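
To make "written on top of OS threads" concrete, here is a minimal sketch of an executor that uses only the standard library. The names `block_on`, `ThreadWaker`, `YieldOnce`, and `add` are made up for this illustration, and this is not how tokio is implemented. It only shows the basic idea: an ordinary OS thread polls a future and parks itself while the future is not ready.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the executor by unparking the OS thread that polls the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical minimal executor: drives one future to completion on the
/// calling OS thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Not ready yet: sleep until something calls `wake`.
            Poll::Pending => thread::park(),
        }
    }
}

/// A future that is not ready on its first poll, so the executor has to
/// come back to it later.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            // Ask to be polled again, then hand control back to the executor.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn add(a: u32, b: u32) -> u32 {
    a + b
}

fn main() {
    let sum = block_on(async {
        let first = add(1, 2).await;
        YieldOnce(false).await;
        first + add(3, 4).await
    });
    println!("{sum}");
}
```

A real runtime adds a task queue, IO event notification, and usually a pool of worker threads, but every task still ends up being polled by some OS thread in this way.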

    As an example, let's say I wanted to write a simple web server. One approach using OS threads would be to create a new thread for every new connection I receive, block that thread until we have received the full request, then send our response. However, we don't really need a full OS thread for each connection to complete this goal. An async/await program might use a single OS thread and handle the connections and incoming data with non-blocking calls one after another. In this way, we make much better use of OS resources, leading to better overall performance under load.
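
The thread-per-connection approach described above could look roughly like this, using only the standard library. It is a sketch under a few assumptions: the "web server" is reduced to an echo server, a request counts as complete once the client closes its sending side, and `handle`, `serve_thread_per_connection`, and `roundtrip` are names invented for the example.

```rust
use std::io::{Read, Write};
use std::net::{Shutdown, SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Reads the whole request, then echoes it back. The calling thread is
/// blocked for as long as the client takes to send its data.
fn handle(mut stream: TcpStream) -> std::io::Result<()> {
    let mut request = Vec::new();
    stream.read_to_end(&mut request)?; // blocks until the client stops sending
    stream.write_all(&request)
}

/// Accepts `max_conns` connections and gives each one its own OS thread.
fn serve_thread_per_connection(listener: TcpListener, max_conns: usize) -> std::io::Result<()> {
    let mut workers = Vec::new();
    for stream in listener.incoming().take(max_conns) {
        let stream = stream?;
        workers.push(thread::spawn(move || handle(stream)));
    }
    for worker in workers {
        worker.join().expect("worker panicked")?;
    }
    Ok(())
}

/// Test client: sends `msg`, closes its sending side, returns the reply.
fn roundtrip(addr: SocketAddr, msg: &[u8]) -> std::io::Result<Vec<u8>> {
    let mut client = TcpStream::connect(addr)?;
    client.write_all(msg)?;
    client.shutdown(Shutdown::Write)?;
    let mut reply = Vec::new();
    client.read_to_end(&mut reply)?;
    Ok(reply)
}

fn main() -> std::io::Result<()> {
    // Port 0 asks the OS for any free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || serve_thread_per_connection(listener, 2));

    assert_eq!(roundtrip(addr, b"hello")?, b"hello".to_vec());
    assert_eq!(roundtrip(addr, b"world")?, b"world".to_vec());
    server.join().expect("server panicked")
}
```

Each connection costs a whole thread (stack, scheduler bookkeeping) even though that thread spends almost all of its time waiting inside `read_to_end`.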

    If we want to, we can do all the same things ourselves using OS threads (for more info look into epoll), but it is not a very enjoyable process. The real genius of async/await is that it lets us write programs that make use of non-blocking IO in an intuitive, linear manner similar to its blocking counterpart. Technically speaking, hand-written code on OS threads can always match or beat async/await, since an async runtime is itself built from those same primitives. However, that doesn't really mean much. It is also true that hand-written assembly can match or beat <choose any language> 100% of the time.
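
To give a feel for the do-it-yourself route, here is a rough sketch of one OS thread serving several connections with non-blocking sockets from the standard library. Several things are assumptions made for the illustration: the server is a plain echo server, `Conn` and `serve_single_thread` are invented names, and the loop simply sleeps for a millisecond between passes. A real implementation would instead sleep in epoll (or a similar OS facility) until a socket is actually ready, and would also write its replies without blocking.

```rust
use std::io::{ErrorKind, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// One in-flight connection and the bytes received on it so far.
struct Conn {
    stream: TcpStream,
    request: Vec<u8>,
}

/// Serves `max_conns` echo requests using only the calling thread.
fn serve_single_thread(listener: TcpListener, max_conns: usize) -> std::io::Result<()> {
    listener.set_nonblocking(true)?;
    let mut conns: Vec<Conn> = Vec::new();
    let mut accepted = 0;
    let mut finished = 0;

    while finished < max_conns {
        // Accept a new connection if one is waiting; never block on it.
        if accepted < max_conns {
            match listener.accept() {
                Ok((stream, _)) => {
                    stream.set_nonblocking(true)?;
                    conns.push(Conn { stream, request: Vec::new() });
                    accepted += 1;
                }
                Err(e) if e.kind() == ErrorKind::WouldBlock => {}
                Err(e) => return Err(e),
            }
        }

        // Give every open connection one chance to make progress.
        let mut i = 0;
        while i < conns.len() {
            let mut buf = [0u8; 1024];
            match conns[i].stream.read(&mut buf) {
                Ok(0) => {
                    // Client finished sending: reply, then drop the connection.
                    // Simplification: the reply is written in blocking mode.
                    let mut conn = conns.swap_remove(i);
                    conn.stream.set_nonblocking(false)?;
                    conn.stream.write_all(&conn.request)?;
                    finished += 1;
                    continue; // slot `i` now holds a different connection
                }
                Ok(n) => conns[i].request.extend_from_slice(&buf[..n]),
                Err(e) if e.kind() == ErrorKind::WouldBlock => {} // no data yet
                Err(e) => return Err(e),
            }
            i += 1;
        }

        // Stand-in for waiting on epoll: avoid spinning at full speed.
        thread::sleep(Duration::from_millis(1));
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || serve_single_thread(listener, 2));

    // Both clients are open at the same time, yet the server uses one thread.
    let mut a = TcpStream::connect(addr)?;
    let mut b = TcpStream::connect(addr)?;
    a.write_all(b"first")?;
    b.write_all(b"second")?;
    b.shutdown(Shutdown::Write)?;
    a.shutdown(Shutdown::Write)?;

    let (mut reply_a, mut reply_b) = (Vec::new(), Vec::new());
    b.read_to_end(&mut reply_b)?;
    a.read_to_end(&mut reply_a)?;
    assert_eq!(reply_a, b"first".to_vec());
    assert_eq!(reply_b, b"second".to_vec());
    server.join().expect("server panicked")
}
```

All of the bookkeeping (which connection is at which stage, what has been read so far) is tracked by hand here. With async/await the compiler generates that state machine from linear-looking code.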

    Rules of Thumb

    With that out of the way, there is no one correct answer.

    Conclusions

    1. If you only have IO tasks, use async threads.
    2. If you only have CPU tasks, use std threads.

    These first two conclusions generally hold up in most use cases.

    3. If you have mixed tasks, use async threads.

    This is highly use-case dependent. Personally, I only use async/await as a last resort due to all of the baggage it brings in and its tendency to infect your codebase (read "What Color is Your Function?" for more info). Generally, try to choose an approach when starting a project and stick to that approach everywhere.