Forking vs Threading


I have used threading in my applications before and know its concepts well, but recently in my operating systems lecture I came across fork(), which seems similar to threading.

I searched Google for the difference between them and learned that:

  1. fork() creates a new process that looks exactly like the parent process, but it is a separate process with a different process ID and its own memory.
  2. Threads are lightweight processes with less overhead.
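To make the memory difference in the two points above concrete, here is a minimal sketch, assuming a Unix-like system and Python's `os.fork()` and `threading` modules. The names `increment`, `value_after_thread`, and `value_after_fork` are made up for illustration. A thread changes the same dictionary the caller sees, while a forked child changes only its own copy.

```python
import os
import threading


def increment(state):
    # Add one to the counter stored in the dictionary.
    state["value"] += 1


def value_after_thread():
    # A thread shares the caller's memory, so the change is visible here.
    state = {"value": 0}
    worker = threading.Thread(target=increment, args=(state,))
    worker.start()
    worker.join()
    return state["value"]


def value_after_fork():
    # A forked child works on its own copy of memory (Unix only).
    state = {"value": 0}
    pid = os.fork()
    if pid == 0:
        increment(state)   # changes the child's private copy only
        os._exit(0)        # leave the child without running parent cleanup
    os.waitpid(pid, 0)     # reap the child
    return state["value"]  # the parent's copy is untouched


if __name__ == "__main__":
    print("after thread:", value_after_thread())
    print("after fork:  ", value_after_fork())
```

If the child needs to hand a result back, it has to use some form of inter-process communication such as a pipe, which is part of the extra cost of processes compared with threads.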

But I still have some questions:

  1. When should you prefer fork() over threading, and vice versa?
  2. If I want to call an external application as a child, should I use fork() or threads to do it?
  3. While searching, I found people saying it is a bad thing to call fork() inside a thread. Why would people want to call fork() inside a thread when the two do similar things?
  4. Is it true that fork() cannot take advantage of a multiprocessor system because the parent and child processes don't run simultaneously?

Solution

  • The main difference between the forking and threading approaches is one of operating system architecture. Back when Unix was designed, forking was a simple mechanism that best answered mainframe- and server-type requirements, so it became popular on Unix systems. When Microsoft re-architected the NT kernel from scratch, it focused more on the threading model. As a result, there is still a notable difference today: Unix systems are efficient at forking, and Windows is more efficient with threads. You can see this most notably in Apache, which uses the prefork strategy on Unix and thread pooling on Windows.

    Specifically to your questions:

    When should you prefer fork() over threading, and vice versa?

    Prefer fork() on a Unix system when you're doing a far more complex task than just instantiating a worker, or when you want the implicit security sandboxing of separate processes.

    If I want to call an external application as a child, should I use fork() or threads to do it?

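    If the child will do an identical task to the parent, with identical code, use fork. For smaller subtasks, use threads. For separate external programs, use neither directly; launch them with the API your platform or language provides for spawning processes (on Unix, for example, posix_spawn(), or fork() followed by one of the exec() calls).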
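As an illustration of "the proper API calls" for an external program, here is a minimal sketch that assumes Python and its standard `subprocess` module. The helper name `run_external` is hypothetical, and the command it runs (the current Python interpreter printing one line) is only a stand-in for a real external application.

```python
import subprocess
import sys


def run_external(args):
    # Start the external program, wait for it to finish, and capture its output.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in external command: the Python interpreter itself.
    output = run_external([sys.executable, "-c", "print('hello from child')"])
    print(output.strip())
```

The caller never touches fork() or threads here; process creation, waiting, and pipe handling are left to the library.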

    While searching, I found people saying it is a bad thing to call fork() inside a thread. Why would people want to call fork() inside a thread when the two do similar things?

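    I'm not entirely sure, but I think it is computationally rather expensive to duplicate a process that has a lot of subthreads. A more concrete hazard on POSIX systems is that fork() copies only the calling thread into the child, so a lock held by any other thread at that moment stays locked in the child with nothing left to release it.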
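What fork() does to the threads of a multithreaded process can be observed with a small sketch, assuming a POSIX system and Python's `os.fork()`. The function name `thread_counts_across_fork` is invented for this example. It starts one extra thread, forks, and has the child report through a pipe how many threads it sees.

```python
import os
import threading


def thread_counts_across_fork():
    # Keep one extra thread alive in the parent while we fork.
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait)
    worker.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report how many threads exist in this copy of the process.
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    os.close(write_fd)
    child_count = int(os.read(read_fd, 16))
    os.close(read_fd)
    os.waitpid(pid, 0)

    stop.set()       # let the extra thread finish
    worker.join()
    return parent_count, child_count


if __name__ == "__main__":
    print(thread_counts_across_fork())
```

The child sees a single thread, the one that called fork(), even though the parent had at least two. Recent Python versions may also print a DeprecationWarning about forking a multi-threaded process, which reflects the same concern.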

    Is it true that fork() cannot take advantage of a multiprocessor system because the parent and child processes don't run simultaneously?

    This is false. fork() creates a new process, which the OS task scheduler then treats like any other process, so parent and child can run at the same time on different processors.
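A common way to use several processors with fork() is to split a job across child processes and collect the pieces in the parent. The following is a minimal sketch, assuming a Unix-like system and Python's `os.fork()`; `forked_sum` and its `workers` parameter are hypothetical names. It does not prove that the children run on different cores, since that decision belongs to the OS scheduler, but nothing in fork() prevents it.

```python
import os


def forked_sum(n, workers=4):
    # Sum the integers 0..n-1 by giving each forked child one slice.
    step = n // workers
    children = []
    for i in range(workers):
        lo = i * step
        hi = n if i == workers - 1 else lo + step
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute its slice and send the result back as text.
            os.close(read_fd)
            os.write(write_fd, str(sum(range(lo, hi))).encode())
            os._exit(0)
        os.close(write_fd)  # parent keeps only the read end
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            total += int(pipe.read())  # read until the child closes its end
        os.waitpid(pid, 0)             # reap the child
    return total


if __name__ == "__main__":
    print(forked_sum(1_000_000))
```

All children are started before the parent reads any result, so their work overlaps rather than running one after another.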