c linux multithreading multiprocessing pthreads

Detaching a thread versus calling pthread_exit() from main(), consequences for resources and memory in both scenarios

I'm fairly new to C programming and am at the moment trying to wrap my head around the pthreads library and threading more generally.

Question 1: When and why is detaching a thread a nice option to have? A lot of what I've read says that detaching a thread is beneficial when you want the thread to live on independently of the main thread. To me this seems to imply that in a lot of the use cases the main thread, for whatever reason, is usually shorter lived. I'm wondering why it wouldn't be simpler to call pthread_exit() at the end of the main thread. I believe that exits main without killing the process, and thereby without killing the remaining thread (through some dark magic that I believe doesn't affect any memory or resources that have been "passed" (not really) from main to the thread in question), which would effectively achieve the same thing as detaching a thread (?).

Question 2: Assuming similar if not the same functionality can be achieved by calling pthread_exit() from main as by detaching a sub-thread, are there any advantages, memory- or resource-wise, to choosing one over the other? Don't both of these methods effectively do the same thing, i.e. keep the heap and other resources from being cleaned up until all threads (for pthread_exit()) or specific threads (for detached threads) have finished?

Cheers for any help.

Solution

Q1: I know of two cases where detached threads are useful.

Sometimes you have a secondary thread that executes an infinite loop, and there's no good way to tell that thread that the whole process is about to exit so it should stop. If the main thread tries to join that thread before exiting, pthread_join will block forever. If it instead calls exit, or returns from main, without joining the looping thread, that secondary thread will be abruptly killed -- whether or not it's detached. Detaching that thread declares that you intend to exit out from under it: it can't be joined because it loops forever, and that is deliberate, not a bug.
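As a minimal sketch of that first case (the names ticker, ticks, and start_ticker are made up for illustration, and the 1 ms sleep is just a stand-in for real periodic work), the looping thread is detached right after creation:

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep under strict -std modes */
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical background task: counts ticks forever, with no exit path. */
static atomic_int ticks;

static void *ticker(void *arg)
{
    (void)arg;
    const struct timespec pause = { 0, 1000000 };  /* 1 ms */
    for (;;) {
        atomic_fetch_add(&ticks, 1);
        nanosleep(&pause, NULL);                   /* stand-in for real work */
    }
}

/* Starts the ticker and detaches it.
 * Returns 0 on success, otherwise the error number from pthreads. */
int start_ticker(void)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, ticker, NULL);
    if (err != 0)
        return err;
    /* Declare that nobody will ever join this thread. */
    return pthread_detach(tid);
}
```

A main that calls start_ticker(), does its own work, and then returns will take the ticker down with it when the process exits; the pthread_detach call is what documents that this is intended.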

Sometimes you have a main thread that ought to spend most of its time blocked on an I/O operation, such as accept, and that dispatches work to secondary threads. There is no way to wait, in a single call, for either a thread to become joinable or some other kind of I/O to become ready. Detaching the secondary threads is one way to make the main thread not need to call pthread_join.
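A rough sketch of the dispatching side, assuming a hypothetical dispatch function that the main loop would call once per accepted connection (the job id, the worker body, and the jobs_done counter are all placeholders of mine). Here the worker is created already detached via a thread attribute, so there is no window in which it is joinable:

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

static atomic_int jobs_done;  /* hypothetical completion counter */

static void *worker(void *arg)
{
    int *job = arg;
    /* ... handle *job here, e.g. serve one accepted connection ... */
    free(job);                        /* the worker owns its argument */
    atomic_fetch_add(&jobs_done, 1);
    return NULL;
}

/* Hands one job to a new detached thread.
 * Returns 0 on success, otherwise an error number. */
int dispatch(int job_id)
{
    pthread_attr_t attr;
    pthread_t tid;

    int *job = malloc(sizeof *job);
    if (job == NULL)
        return ENOMEM;
    *job = job_id;

    int err = pthread_attr_init(&attr);
    if (err == 0) {
        /* Created detached: the main thread never has to join it. */
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        err = pthread_create(&tid, &attr, worker, job);
        pthread_attr_destroy(&attr);
    }
    if (err != 0)
        free(job);                    /* no worker exists to free it */
    return err;
}
```

The main thread can then go straight back to blocking in accept after each dispatch call, since it keeps no thread handles that would need joining later.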

Q1a: In terms of what happens, the only difference between a detached thread, and a non-detached thread that no other thread bothers to join, is that the C library has no way of knowing that nobody’s going to bother to join the non-detached thread, so it won’t free up all the resources associated with that thread when it exits. It should free most of those resources, but there’s no requirement for it to; in the worst case (if something is continuously creating threads without either detaching or joining them) this could amount to a substantial resource leak.

In the scenario you’re talking about, where main creates a small, fixed number of long-lived threads and then calls pthread_exit, this resource leak isn’t worth worrying about. However, if I were writing the program, I would still either detach or join all of the threads, for three reasons. First, it’ll be easier for someone else reading my code (or, for that matter, me reading my code again a couple years from now) to understand that the program is correct if all threads are either detached or joined. You probably don’t have the experience yet to understand just how important this is, so, exercise for you: dig up a program that you wrote at least six months ago and haven’t touched since, go through it line by line and attempt to explain to another student how it works. If you haven’t got another student, a rubber duck or stuffed animal will be almost as good.

Second, calling pthread_exit from main means you are abandoning any opportunity to find out if the other threads succeeded in doing what they were supposed to do. If you instead join all the other threads, then you can have those threads pass back information about whether they succeeded, and you can aggregate that information into the process’s exit status (the number returned from main). This is especially important for short-lived programs that might get run from shell scripts, because shell scripts specifically look at process exit statuses to detect when some operation failed.
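To illustrate that second reason, here is a small sketch in which each thread reports failure through its return value and the joining thread folds the results into one exit status. The task function, its "negative id means failure" rule, and MAX_TASKS are assumptions for the example only:

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_TASKS 16  /* arbitrary limit for this sketch */

/* Hypothetical task: pretend that negative ids fail.
 * Returns NULL on success and a non-NULL value on failure. */
static void *task(void *arg)
{
    intptr_t id = (intptr_t)arg;
    return (void *)(intptr_t)(id < 0);
}

/* Runs one thread per id, joins them all, and returns
 * EXIT_SUCCESS only if every thread started and succeeded. */
int run_tasks(const int *ids, int n)
{
    pthread_t tids[MAX_TASKS];
    int started = 0;
    int status = EXIT_SUCCESS;

    if (n > MAX_TASKS)
        return EXIT_FAILURE;

    for (; started < n; started++) {
        void *arg = (void *)(intptr_t)ids[started];
        if (pthread_create(&tids[started], NULL, task, arg) != 0) {
            status = EXIT_FAILURE;
            break;
        }
    }

    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++) {
        void *result;
        if (pthread_join(tids[i], &result) != 0 || result != NULL)
            status = EXIT_FAILURE;
    }
    return status;
}
```

Returning the value of run_tasks from main gives a shell script something to test, which is exactly what is lost if main calls pthread_exit instead.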

Third, a program that creates N long-lived threads from main and then calls pthread_exit can always be restructured to create N − 1 long-lived threads from main and then do whatever the Nth thread was going to do directly in the main thread, which is slightly more efficient and may be easier to understand. This isn’t nearly as important as the other two reasons, though.

Q2: Indeed, in general, short-lived programs need not bother with fine-grained resource deallocation and in fact it's often more efficient not to clean things up on the way out. (The time cost of calling free for tens of thousands of small allocations, for instance, can be substantial, and all that work is redundant to the OS erasing the entire address space when the process exits.)

People will tell you sternly never to skip that work; this is because you want to be in the habit of carefully managing resources so you don't mess it up when you do write a long-lived program. But once you get into that habit, it's completely fine to decide on purpose to skip the cleanup for a program that does its thing and exits.

Q3: When you call pthread_exit, only the calling thread is terminated (even if it’s the main thread). If the thread is detached, all of the resources associated directly with the thread are freed at this point. If it’s not detached, a small amount of data has to be preserved until someone joins the thread, but most of the resources should be freed immediately. (As I mentioned above, though, there’s no actual requirement for any of it to be freed until the thread is joined. This is because it’s actually quite difficult to free as much as possible as soon as the thread exits! Ask a separate question about that if you want to know why.)

Thing is, though, there aren’t very many “resources associated directly with a thread.” The biggest item is the thread’s stack. There’s also a chunk of memory called a “thread control block” or “thread descriptor,” and there may be some kernel resources: a “light weight process,” whatever that means on your OS, and maybe some locks and stuff.

Basically all other resources are associated with the process and don’t get deallocated until explicitly released or the entire process exits. Important things that will not get deallocated when a thread exits, even if the remaining threads cannot access them anymore, include:

Open files
Memory allocated with malloc and friends
Memory mappings (mmap)
Locks held by the exited thread
The program’s code
Global data
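
The malloc entry in that list can be demonstrated with a short sketch (producer and fetch_message are names I made up): a thread allocates a buffer and exits, and the buffer is still valid afterwards because it belongs to the process, not to the thread.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Allocates a message on the heap and then exits. The allocation
 * is not released when this thread terminates. */
static void *producer(void *arg)
{
    (void)arg;
    char *msg = malloc(32);
    if (msg != NULL)
        strcpy(msg, "made by an exited thread");
    return msg;  /* same effect as pthread_exit(msg) */
}

/* Runs the producer to completion and returns its buffer.
 * The caller must free() the result; NULL means failure. */
char *fetch_message(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, producer, NULL) != 0)
        return NULL;
    if (pthread_join(tid, &result) != 0)
        return NULL;
    return result;  /* still valid: the producer thread is gone, the heap is not */
}
```

If nobody kept the pointer, the buffer would simply leak until the process exits, which is the flip side of the same rule.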