Tags: c, linux, multiprocessing, infinite-loop

Fork in Linux: Weird infinite loop while freeing condition variable


Context

System: Fedora 40
Library Used: pthread.h

Details (skip to "To the point" if not interested)

I apologize for providing no code here as the whole program is more than 4000 lines of code and is not hosted on Github either. I will try my best to explain in words.

The program is a simple database. It is for my parent's shop. The computer running Linux has the program constantly running. I use one file per customer to store everything about them. There are salesmen as well, so the program has to be able to open and read files for many salesmen and customers at once.

The overall structure of the program is simple: a manager thread runs and starts a new thread per customer or salesman. So far it has peaked at 6 threads, but with more salesmen I thought a separate process dedicated to salesmen would be a better choice. Context switching hampers the program's speed, so when a file for a salesman needs to be opened, the manager thread creates a new process.

To the point

Each thread maps 4 KB of memory with MAP_ANONYMOUS and MAP_PRIVATE and has a memory manager for it. This mapped memory serves as a cache for the results of earlier expensive computations. Each thread has its own mutexes and condition variables. There is also a request queue and a request handler managed by the Manager thread. The handler has its own lock and condition variable as well.

What I found: I researched multi-process programs and read that it is better to re-initialize everything after forking, so I had the Manager thread do the fork. That way only the Manager thread is replicated in the child. But I also found that the mapped memory, locks, condition variables, malloced memory, basically everything, is inherited as well. That includes the locks and condition variables of threads that do not exist in the child. Since those threads aren't running there, I freed all allocated memory, destroyed the locks and unmapped the memory. The exception is the condition variables: every time, the third thread's condition variable cannot be destroyed. For some reason, the pthread_cond_destroy call never returns.

LLM's Suggestion

I presented the problem to ChatGPT and Gemini and the solution I got: Don't attempt to free the inherited locks and condition variables or even the memory, just re-initialize everything. I couldn't bring myself to leave unused memory allocated.


Solution

  • Your entire design is fundamentally broken.

    Per the POSIX documentation for fork():

    A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.

    In short:

    If a multi-threaded process calls fork() ... the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
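
    One way to stay within this restriction is to have the child call an exec function immediately, so the salesman logic lives in a separate program with its own fresh state. The following is a minimal sketch, not the asker's code: spawn_worker and wait_for_worker are hypothetical helper names, and the worker executable path is whatever program you choose to run.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a separate program in a child process.
 * Between fork() and execv() the child calls nothing else; execv() and
 * _exit() are both on the signal-safety(7) list. */
pid_t spawn_worker(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execv(path, argv);  /* replaces the inherited address space */
        _exit(127);         /* reached only if execv() failed */
    }
    return pid;             /* child's pid in the parent, -1 if fork() failed */
}

/* Hypothetical helper: wait for the worker and return its exit status,
 * or -1 if waiting failed or the worker did not exit normally. */
int wait_for_worker(pid_t pid)
{
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;      /* real error; EINTR just means "retry" */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    Because the new program image starts from scratch, none of the parent's mutexes, condition variables, or mappings survive in the worker, so there is nothing to destroy or free there.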

    Per the Linux man page for fork():

    Note the following further points:

    • The child process is created with a single thread—the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.

    • After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2).
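
    A minimal sketch of the pthread_atfork(3) approach that the man page mentions, assuming a single hypothetical mutex named cache_lock (the question shows no code, so all names here are illustrative). The prepare handler takes the lock so no other thread can hold it at the moment of the fork; the parent and child handlers then release it. This only covers locks you register handlers for, and the child is still formally limited to async-signal-safe functions, so treat it as a mitigation rather than a fix for the design.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock standing in for one of the program's mutexes. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread just before fork(): acquire the lock so
 * that no other thread owns it when the address space is copied. */
static void lock_before_fork(void)  { pthread_mutex_lock(&cache_lock); }

/* Runs after fork() in both parent and child: release the lock. */
static void unlock_after_fork(void) { pthread_mutex_unlock(&cache_lock); }

/* Register the handlers once at startup; returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(lock_before_fork,
                          unlock_after_fork,   /* parent side */
                          unlock_after_fork);  /* child side */
}

/* Demo only: fork and report whether the child sees cache_lock unlocked.
 * Returns 0 if the child could take the lock, 1 if not, -1 on error. */
int fork_and_check_lock(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(pthread_mutex_trylock(&cache_lock) == 0 ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    With many locks spread across threads, as described in the question, every one of them would need the same treatment in a consistent order, which is a large part of why exec-ing a separate program is usually simpler.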

    The Linux list of async-signal-safe functions can be found at the Linux signal-safety.7 man page.

    Deadlocks on locks held by another thread that doesn't exist in the child process are one possible result of failing to follow these limitations.

    And this?

    LLM's Suggestion

    I presented the problem to ChatGPT and Gemini and the solution I got: Don't attempt to free the inherited locks and condition variables or even the memory, just re-initialize everything. I couldn't bring myself to leave unused memory allocated.

    That is what's wrong with LLMs - they know nothing - they merely regurgitate whatever crap they've been fed - like "glue on a pizza". That lame advice ignores the fundamental restrictions placed on the child of a fork() from a multithreaded parent.