I read online that most modern UNIX systems come with a thread-safe malloc() by default. I know this simply means that a thread can call malloc() safely while another thread is already in the middle of a malloc() call itself.
I am using pthreads for my multithreading. I have a 12-core CPU with 2 threads per core, so 24 hardware threads in total. Also, I'm using the GNU C library implementation of malloc.
My question is about whether simultaneous malloc() calls can proceed at the same time without locking/waiting/blocking. I read in an answer that malloc() "uses internal locking mechanisms" when multiple threads are calling it at the same time.
So here's my question exactly:
If 8 threads happen to call malloc() at the exact same time, would there be 8 malloc calls happening in parallel, and they won't interfere with each other whatsoever?
Or is it the case that when one thread calls malloc(), the other threads MUST WAIT for this thread's malloc call to finish BEFORE they can proceed with their own malloc calls?
(I'm asking this because I just multithreadified a C program of mine that does make extensive use of malloc() and free(), and the speedup was not linear with threads used, even though logically it should have been, because none of the threads rely on anything global so no contention should be happening (in software anyway). My scenario is simple: Each thread calls a function that takes roughly 315 seconds to complete on 1 thread (no multithreading), which makes millions of other calls to functions I've defined. Since function code is read-only, there should be no problem in speedup of X threads running this top-level function in parallel, given that each thread called it with its own arguments and no threads are relying on anything global or shared. When I used 4 threads, the time for some reason went up from 315 sec to 710 seconds, and when I used 8 threads, the time went up to 1400 seconds, even though each thread is doing exactly the same work that the one thread without multithreading was doing, and was taking 315 seconds to complete. So, what the hell??)
If 8 threads happen to call malloc() at the exact same time, would there be 8 malloc calls happening in parallel, and they won't interfere with each other whatsoever?
It depends on the malloc() implementation, among other things. Modern C standard libraries for general-purpose operating systems generally cater to simultaneous multiprocessing.
Glibc's malloc, for example, maintains multiple memory arenas from which to allocate, so as to avoid a single malloc() call forcing all others to block until it completes. It manages these adaptively, but by default allows up to eight times as many arenas as there are CPUs in the system (on 64-bit systems). This is per process, of course. If you are running on a Glibc-based system, then, it might indeed be that your 8 malloc() calls all proceed simultaneously. No interference whatsoever is a very high bar, but I think it would be safe to say that there would usually be minimal interference.
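One way to check how concurrent malloc() calls behave on a particular system is to time them directly. The following is only a rough sketch, not a rigorous benchmark: the names worker_arg, alloc_worker and run_alloc_threads are made up for illustration, and the block sizes are arbitrary assumptions rather than anything taken from the program in the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical per-thread work description (illustrative only). */
typedef struct {
    long iterations; /* malloc/free pairs this thread should perform */
    long succeeded;  /* filled in by the thread */
} worker_arg;

static void *alloc_worker(void *p)
{
    worker_arg *arg = p;
    arg->succeeded = 0;
    for (long i = 0; i < arg->iterations; i++) {
        size_t size = 16 + (size_t)(i % 64) * 8; /* arbitrary small sizes */
        char *block = malloc(size);
        if (block == NULL)
            continue;
        memset(block, 0xAB, size); /* touch the memory that was handed out */
        arg->succeeded++;
        free(block);
    }
    return NULL;
}

/* Starts nthreads workers that all allocate at the same time.
 * Returns the total number of successful allocations, or -1 on failure.
 * If elapsed_sec is not NULL, it receives the wall-clock time in seconds. */
long run_alloc_threads(int nthreads, long iterations, double *elapsed_sec)
{
    assert(nthreads > 0 && iterations >= 0);
    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    worker_arg *args = malloc((size_t)nthreads * sizeof *args);
    if (tids == NULL || args == NULL) {
        free(tids);
        free(args);
        return -1;
    }

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    int started = 0;
    for (; started < nthreads; started++) {
        args[started].iterations = iterations;
        if (pthread_create(&tids[started], NULL, alloc_worker, &args[started]) != 0)
            break;
    }

    long total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].succeeded;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (elapsed_sec != NULL)
        *elapsed_sec = (double)(end.tv_sec - start.tv_sec)
                     + (double)(end.tv_nsec - start.tv_nsec) / 1e9;

    free(tids);
    free(args);
    return started == nthreads ? total : -1;
}
```

Calling run_alloc_threads(1, n, &t) and then run_alloc_threads(8, n, &t) with the same n, and comparing the two times, gives a rough idea of how much the allocator itself costs as threads are added. If the time for 8 threads stays close to the time for 1 thread, the allocator is probably not what is holding the program back. Compile with -pthread.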
The answer might be different on other systems. In particular, Windows' allocator has a poor performance reputation in general, though I don't know specifically how well it handles multithreaded apps.
Nevertheless, if your threads are doing so much dynamic memory management that you supposed that was a likely source of your performance issue, then that's probably too much. Even if it's not specifically an issue for scaling up the number of threads, malloc() and free() are comparatively slow, so you should try to minimize their use where performance is important.
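As one possible way to cut down on malloc() and free() calls, each thread could grab a single large buffer up front and hand out pieces of it itself. The sketch below is a simple bump allocator under that assumption; the scratch_arena type and the scratch_* functions are hypothetical names, not part of any library, and it only fits workloads where all the pieces can be discarded together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-thread scratch buffer: one malloc() up front,
 * many cheap sub-allocations, one free() at the end. */
typedef struct {
    unsigned char *base; /* start of the buffer */
    size_t capacity;     /* total bytes available */
    size_t used;         /* bytes handed out so far */
} scratch_arena;

/* Returns 1 on success, 0 if the underlying malloc() failed. */
int scratch_init(scratch_arena *a, size_t capacity)
{
    assert(a != NULL);
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out size bytes from the buffer, or NULL if it does not fit.
 * Offsets are kept at multiples of 16 bytes relative to base. */
void *scratch_alloc(scratch_arena *a, size_t size)
{
    if (a->base == NULL || size > a->capacity - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += size;
    a->used = (a->used + 15) & ~(size_t)15; /* round up for the next caller */
    if (a->used > a->capacity)
        a->used = a->capacity;
    return p;
}

/* Discards every sub-allocation at once; the buffer itself is kept. */
void scratch_reset(scratch_arena *a)
{
    a->used = 0;
}

/* Returns the buffer to the system allocator. */
void scratch_destroy(scratch_arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

A thread would call scratch_init() once before its main loop, use scratch_alloc() wherever it previously called malloc() for short-lived data, call scratch_reset() between iterations, and call scratch_destroy() when it finishes. Because each thread owns its own scratch_arena, no locking is involved. Whether this actually helps depends on the allocation pattern, so it is worth measuring before and after.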