c multithreading shared-libraries libxml2

libxml2.so Init/CleanupParser usage for multiple processes with threads


I am using libxml2 as a shared library from several processes running in parallel in an ARM/Linux environment. The official examples are one-off processes where the handling of xmlInitParser() and xmlCleanupParser() is trivial.

In a setup where multiple multithreaded processes use libxml2, I am not sure when to call these two functions.

proc1 -thread1-> work -> libxml2.so -> ...
      -thread2-> work -> ...
      -thread3-> work -> work -> libxml2.so -> work -> ...
proc2 -thread1-> wrapper.so -> libxml2.so -> work -> ...
      -thread2-> work -> ...
      -thread3-> wrapper.so -> libxml2.so -> ...

Problems


xmlInitParser()

Initialization function for the XML parser. This is not reentrant. Call once before processing in case of use in multithreaded programs.

xmlCleanupParser()

It cleans up memory allocated by the library itself. It is a cleanup function for the XML library. It tries to reclaim all related global memory allocated for the library processing. It doesn't deallocate any document related memory. One should call xmlCleanupParser() only when the process has finished using the library and all XML/HTML documents built with it. See also xmlInitParser() which has the opposite function of preparing the library for operations. WARNING: if your application is multithreaded or has plugin support, calling this may crash the application if another thread or a plugin is still using libxml2.

I guess the question reveals my limited knowledge of the inner workings of shared libraries... Thanks for any help!


Solution

  • Nowadays, xmlInitParser is thread-safe. But libxml2 will internally designate the first thread that calls the function as the "main thread". It's best to call this function once from the main thread before spawning other threads, for example, at the beginning of main.

    xmlCleanupParser must never be called if any thread of the process invokes any other libxml2 function afterwards. So wrapping each thread's work in its own xmlInitParser() ... xmlCleanupParser() pair is not OK. You normally don't have to call xmlCleanupParser at all, but you'll need it to avoid false positives when checking for memory leaks with tools like valgrind or ASan. In this case, it's best to call it right before the process exits, for example, at the end of main. Nowadays, xmlCleanupParser will be called automatically when libxml2 is unloaded on most platforms.