Tags: python, java, multithreading, fastapi, project-loom

Java's Spring Boot vs Python's FastAPI: Threads


I'm a Java Spring Boot developer and I build 3-tier CRUD applications. I talked to a guy who seemed knowledgeable on the subject, but I didn't get his contact details. He was advocating for Python's FastAPI because it scales better horizontally than Spring Boot. One of the reasons he mentioned is that FastAPI is single-threaded: when the thread encounters a database lookup (or other work that can be done asynchronously), it picks up other work and returns to the current work once the database results have come in. In Java, when you have many requests pending, the thread pool may get exhausted.

I don't understand this reasoning 100%, so let me play devil's advocate. When the Python program encounters an async call, it must store the program pointer somewhere to remember where it needs to continue later. I know that the place where the program pointer is stored is not a thread at all, but I have to give it some name, so let's call it a "logical thread". In Python, you can have many logical threads that are waiting. In Java, you can have a thread pool with many real threads that are waiting. To me, the only difference seems to be that Java's threads are managed at the operating-system level, whereas Python's "logical threads" are managed by Python or FastAPI. Why are real threads that are waiting in a thread pool so much more expensive than logical threads that are waiting? If most of my threads are waiting, why can't I just increase the thread pool size to avoid exhaustion?
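
To make the "just increase the thread pool size" idea concrete, here is a minimal sketch (plain JDK, not Spring-specific; the class name, pool size and sleep duration are made up for illustration) of what a big pool of mostly-waiting workers looks like at the OS level: every pooled worker is a full platform thread with its own reserved stack (often around 1 MB by default), even while it does nothing but wait.

    // Minimal sketch: a large fixed pool where every worker just blocks,
    // standing in for threads that are parked on a database lookup.
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class BigPoolDemo {
        public static void main(String[] args) throws InterruptedException {
            int poolSize = 10_000; // hypothetical "just make it bigger" value
            ExecutorService pool = Executors.newFixedThreadPool(poolSize);
            for (int i = 0; i < poolSize; i++) {
                pool.submit(() -> {
                    try {
                        // Stand-in for a blocking database lookup: the OS thread
                        // is parked here, but its stack stays reserved the whole time.
                        Thread.sleep(30_000);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            System.out.println("Platform threads alive: " + Thread.activeCount());
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }

Each of those 10,000 waiting workers is a real OS thread that the kernel has to create, give a stack to, and schedule, which is roughly the cost the answer below is about.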


Solution

  • The issues with Java threads in the question are addressed by Project Loom, which is now included in JDK 21 as virtual threads (a small virtual-thread sketch follows after the quote). It is very well explained here: https://www.baeldung.com/openjdk-project-loom :

    Presently, Java relies on OS implementations for both the continuation [of threads] and the scheduler [for threads].

    Now, in order to suspend a continuation, it's required to store the entire call-stack. And similarly, retrieve the call-stack on resumption. Since the OS implementation of continuations includes the native call stack along with Java's call stack, it results in a heavy footprint.

    A bigger problem, though, is the use of OS scheduler. Since the scheduler runs in kernel mode, there's no differentiation between threads. And it treats every CPU request in the same manner. (...) For example, consider an application thread which performs some action on the requests and then passes on the data to another thread for further processing. Here, it would be better to schedule both these threads on the same CPU. But since the [OS] scheduler is agnostic to the thread requesting the CPU, this is impossible to guarantee.

    The question really boils down to: why are OS threads considered expensive?
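
    To connect the quote to the original question, here is a minimal sketch of the same blocking workload on JDK 21 virtual threads (the class name, task count and sleep duration are again only illustrative). When a virtual thread blocks, its continuation (the suspended call stack) is stored on the Java heap, and the small pool of carrier OS threads is freed to run other virtual threads, so thousands of waiting "logical threads" no longer pin thousands of OS threads.

      // Minimal sketch on JDK 21: one virtual thread per task, all of them blocking.
      import java.time.Duration;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.stream.IntStream;

      public class VirtualThreadDemo {
          public static void main(String[] args) {
              try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
                  IntStream.range(0, 10_000).forEach(i ->
                      executor.submit(() -> {
                          // Stand-in for a blocking database lookup. Parking a virtual
                          // thread unmounts it from its carrier OS thread instead of
                          // holding that OS thread for the whole wait.
                          Thread.sleep(Duration.ofSeconds(1));
                          return i;
                      }));
              } // close() waits for all submitted tasks to finish
              System.out.println("Ran 10,000 concurrent blocking tasks on a handful of OS threads");
          }
      }

    The programming model stays the familiar blocking style from the question; only the cost of a waiting thread changes, which is exactly the gap between Java's thread pool and FastAPI's "logical threads" that the question is asking about.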