I work for a company that develops its own game engine. One of its features is dynamically loading and unloading dlls (plugins). I've noticed weird behavior whenever my process frees one specific dll: it results in a significant performance drop.
I've made countless attempts to identify what's causing the issue (see below), but nothing seems to work. ChatGPT's suggestions were useless, and I don't know what to do from here.
Full details:
- This is a specific dll (that we also develop here, so we have full access to its source). This dll is also linked to multiple other dlls and libs.
- The loading app only loads this dll (with LoadLibrary()), and then unloads it (using FreeLibrary()).
- Once this is done, we see a major performance degradation.
- This only happens in debug builds.
Here's what I already tried:
- I used the VS profiling tools to look for a bottleneck. Nothing showed up. I also checked for memory leaks, IO, and everything else VS has to offer.
- I wrote a simple program that does nothing but load the dll and unload the dll, and between each step I ran a "calculation" that normally takes about 3 seconds. It consistently shows that after the dll is unloaded the calculation gets about 30% slower, and if we load the dll again it gets about 4x slower (the task now takes about 12 seconds). (When doing the same experiment from the game engine, we drop from about 120 FPS to 0.5 FPS, so the drop there is even greater.)
- I looked at the code of this dll and made sure it was only this dll that had the issue. I looked at the UCRT implementation and tried to identify each global and static variable that is initialized (there were a lot).
- I went to each constructor and destructor that I could find and made their implementations trivial. I made sure there were no locks, no critical sections, and no threads waiting to be joined. I checked whether the memory allocation functions were overridden (they were, so that allocations are aligned).
- I looked at procmon to see if I could spot something, but couldn't (though my knowledge of it is very limited).
- I looked at 'depends' to see if I could get a hint. Nothing.
- I debugged the code after LoadLibrary() (and FreeLibrary()) is called, to see if there were any hanging threads or resources that were not freed correctly. I couldn't find anything.
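For anyone who wants to reproduce the load/unload timing experiment from the list above, here is a minimal sketch of such a harness. It is not the program I actually used: `run_workload`, `Step`, and `time_between_steps` are made-up names, and the workload is just an arbitrary loop that allocates on every pass.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the "calculation": a fixed amount of CPU work
// with one heap allocation per pass.
std::uint64_t run_workload(std::size_t iterations) {
    std::uint64_t checksum = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        std::vector<std::uint64_t> scratch(16, static_cast<std::uint64_t>(i));
        for (std::uint64_t v : scratch) checksum += v;
    }
    return checksum;
}

// One step of the experiment, e.g. a call to LoadLibrary() or FreeLibrary().
struct Step {
    std::string label;
    std::function<void()> action;
};

using Timing = std::pair<std::string, double>;  // (label, elapsed milliseconds)

// Times a single run of the workload with a monotonic clock.
static double time_workload_ms(std::size_t iterations) {
    static volatile std::uint64_t sink;  // keeps the result observable
    const auto start = std::chrono::steady_clock::now();
    sink = run_workload(iterations);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Runs the workload once as a baseline, then once more after each step.
std::vector<Timing> time_between_steps(const std::vector<Step>& steps,
                                       std::size_t iterations) {
    std::vector<Timing> results;
    results.emplace_back("baseline", time_workload_ms(iterations));
    for (const Step& step : steps) {
        step.action();
        results.emplace_back("after " + step.label, time_workload_ms(iterations));
    }
    return results;
}
```

On Windows the step actions would wrap the LoadLibrary() and FreeLibrary() calls for the dll in question. Comparing each "after ..." entry against the baseline gives the slowdown per step.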
The only thing that worked was loading the dll using LoadLibraryEx with the flag LOAD_LIBRARY_AS_IMAGE_RESOURCE, but this isn't very helpful, since the global and static variables are not initialized in this case.
I read suggestions that this could be TLS related, so I tried finding all the places with relevant code and removing them.
What I didn't do:
- Remove all static and dynamic linking this dll has (it links to about 10 other dlls, as well as a lot of Windows dlls). This is not trivial at all, as there are thousands of references in this dll.
- Go over all static and global variables. There are literally more than 10000 of them.
At this point I'm lost. I would appreciate any suggestions, tools, or ideas.
After basically removing every possible chunk of code from our implementation, it turned out that we had a (stupid) wrapper class that wraps the MFCLeakReport class.
In it, there were calls to _CrtSetReportHook. Removing these calls completely solved the issue. Why this was there in my company's code, or how to implement it correctly, is irrelevant to the solution, but if anyone ever encounters similar behavior, try looking for MFC hooks.
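For completeness, if a plugin really does need a CRT report hook, one generic pattern is to pair every install with a removal, using _CrtSetReportHook2 and its _CRT_RPTHOOK_INSTALL / _CRT_RPTHOOK_REMOVE modes. This is only a sketch: `ScopedCrtReportHook`, `plugin_report_hook`, and the `PLUGIN_HAS_DEBUG_CRT` macro are made-up names, this is not what our wrapper did, and I have not verified that it would have avoided the slowdown described above.

```cpp
#include <cstdio>

// The debug CRT hooks only exist in MSVC debug builds; elsewhere this
// sketch compiles to a no-op.
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#define PLUGIN_HAS_DEBUG_CRT 1
#else
#define PLUGIN_HAS_DEBUG_CRT 0
#endif

#if PLUGIN_HAS_DEBUG_CRT
// Hypothetical hook: echoes the report to stderr and returns 0 (FALSE),
// so the CRT still reports the message the normal way.
static int plugin_report_hook(int /*reportType*/, char* message,
                              int* /*returnValue*/) {
    if (message != nullptr) std::fputs(message, stderr);
    return 0;
}
#endif

// Installs the hook on construction and removes it on destruction, so the
// CRT is not left holding a hook that belongs to an unloaded module.
class ScopedCrtReportHook {
public:
    ScopedCrtReportHook() {
#if PLUGIN_HAS_DEBUG_CRT
        // _CrtSetReportHook2 returns -1 on error.
        installed_ =
            _CrtSetReportHook2(_CRT_RPTHOOK_INSTALL, plugin_report_hook) != -1;
#endif
    }

    ~ScopedCrtReportHook() {
#if PLUGIN_HAS_DEBUG_CRT
        if (installed_) {
            _CrtSetReportHook2(_CRT_RPTHOOK_REMOVE, plugin_report_hook);
        }
#endif
    }

    ScopedCrtReportHook(const ScopedCrtReportHook&) = delete;
    ScopedCrtReportHook& operator=(const ScopedCrtReportHook&) = delete;

    bool installed() const { return installed_; }

private:
    bool installed_ = false;
};
```

The object has to be destroyed before the dll is unloaded (for example, owned by whatever the plugin tears down on shutdown) for the removal to happen in time.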