I've noticed that on macOS dlopen is massively slower when used on a freshly created file than when used to open the very same file already existing on disk. Take a look at the following little demo program:
This is the lib code:
__attribute__((visibility("default"))) int addfunc(int a, int b)
{
    return a + b;
}
As you can see, it's as basic as it gets with just a single addition. No external dependencies or anything.
It's compiled like this:
gcc -fPIC -dynamiclib -o mylib.dylib lib.c
And this is the tester code:
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
#include <sys/time.h>

/* dest = dest - src, borrowing a second if the microseconds underflow */
static void subtime(struct timeval *dest, struct timeval *src)
{
    if(dest->tv_usec < src->tv_usec) {
        dest->tv_sec = dest->tv_sec - src->tv_sec - 1;
        dest->tv_usec = 1000000 - (src->tv_usec - dest->tv_usec);
    } else {
        dest->tv_sec = dest->tv_sec - src->tv_sec;
        dest->tv_usec = dest->tv_usec - src->tv_usec;
    }
}

/* Time a single dlopen() of file, then call addfunc() to verify the library */
static void testcase(char *file)
{
    void *lib;
    int (*addfunc)(int a, int b);
    struct timeval oldt, newt;

    gettimeofday(&oldt, NULL);
    lib = dlopen(file, RTLD_LAZY);
    gettimeofday(&newt, NULL);
    subtime(&newt, &oldt);
    printf("dlopen() duration: %d %d\n", (int) newt.tv_sec, (int) newt.tv_usec);

    if(!lib) {
        fprintf(stderr, "dlopen() failed: %s\n", dlerror());
        return;
    }

    addfunc = dlsym(lib, "addfunc");
    if(addfunc) printf("Testing lib: %d\n", addfunc(5, 6) == 11);
    dlclose(lib);
}

int main(int argc, char *argv[])
{
    char *buf;
    FILE *fp;
    long size;

    printf("Now trying dlopen() from regular file...\n");
    testcase("mylib.dylib");

    /* Read mylib.dylib into memory... */
    fp = fopen("mylib.dylib", "rb");
    if(!fp) {
        perror("mylib.dylib");
        return 1;
    }
    fseek(fp, 0, SEEK_END);
    size = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    buf = malloc(size);
    if(!buf || fread(buf, size, 1, fp) != 1) {
        fprintf(stderr, "Failed to read mylib.dylib\n");
        free(buf);
        fclose(fp);
        return 1;
    }
    fclose(fp);

    /* ...and write an identical copy to a brand-new file */
    fp = fopen("tmpfile", "wb");
    if(!fp) {
        perror("tmpfile");
        free(buf);
        return 1;
    }
    fwrite(buf, size, 1, fp);
    fclose(fp);
    free(buf);

    printf("Now trying dlopen() from newly created file...\n");
    testcase("tmpfile");
    remove("tmpfile");
    return 0;
}
The tester code is compiled like this:
gcc -o loader loader.c
The tester code will do two things:
1. It will dlopen mylib.dylib directly. This takes less than 500 microseconds here.
2. It will copy mylib.dylib to a new file named tmpfile and then it will dlopen the newly created tmpfile. To my surprise, this is slow as hell: it usually takes at least 200,000 microseconds, which is ages longer than the first case.
So what is going on here? Is macOS first phoning Cupertino to check that there is no malware in the newly created file before it can be dlopen'ed, or why on earth is this so slow?
I'm on a barebones macOS 13.6.7. I don't have any malware/virus scanners installed that could be responsible for this slowness. It must come from macOS itself.
Does anybody have an explanation for what I'm seeing here and more importantly: is there any way to speed this up?
This is code signature caching at the vnode layer.
You can see this by first running:
sudo sysctl vm.cs_debug=1
log stream --process 0 --predicate 'sender == "AppleMobileFileIntegrity"'
If you recompile your dylib (or copy it and rename it back, or reboot the machine - just anything that clears the vnode cache), you will see that running ./loader takes the same amount of time on both dlopen calls, and you will see this in the log:
kernel: (AppleMobileFileIntegrity) AMFI: vnode_check_signature called with platform 1
kernel: (AppleMobileFileIntegrity) AMFI: '/private/tmp/aaa/mylib.dylib' has no CMS blob?
kernel: (AppleMobileFileIntegrity) AMFI: '/private/tmp/aaa/mylib.dylib': Unrecoverable CT signature issue, bailing out.
kernel: (AppleMobileFileIntegrity) AMFI: code signature validation failed.
kernel: (AppleMobileFileIntegrity) AMFI: vnode_check_signature called with platform 1
kernel: (AppleMobileFileIntegrity) AMFI: '/private/tmp/aaa/tmpfile' has no CMS blob?
kernel: (AppleMobileFileIntegrity) AMFI: '/private/tmp/aaa/tmpfile': Unrecoverable CT signature issue, bailing out.
kernel: (AppleMobileFileIntegrity) AMFI: code signature validation failed.
For any subsequent run, you will only see this for the second binary:
kernel: (AppleMobileFileIntegrity) AMFI: vnode_check_signature called with platform 1
kernel: (AppleMobileFileIntegrity) AMFI: '/private/tmp/aaa/tmpfile' has no CMS blob?
kernel: (AppleMobileFileIntegrity) AMFI: '/private/tmp/aaa/tmpfile': Unrecoverable CT signature issue, bailing out.
kernel: (AppleMobileFileIntegrity) AMFI: code signature validation failed.
So yeah, first time executable mappings are slow, thank you very much Tim Apple.
As for what you can do about it: you could unload AMFI in its entirety, if you only care about getting the maximum performance out of your machine and don't need to run any day-to-day apps. To do that, you'd need to boot into recovery, enable boot-args with bputil -a, and then disable AMFI via sudo nvram boot-args='amfi_get_out_of_my_way=1'. But doing so will mark all processes as "platform" (Apple) processes, for which the system applies some tighter restrictions than for 3rd party processes, so this is likely to make some apps crash.
Other than that, you could of course pre-load the code signature cache, either by parsing the Mach-O header, finding the code signature offset, and calling fcntl(F_ADDFILESIGS, ...) with it like dyld does, or by simply calling dlopen like you already do... but of course that only moves the operation elsewhere, and doesn't get rid of it.
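For illustration, here is a rough sketch of what the fcntl route could look like. It is not dyld's actual code. The names find_code_signature and preload_code_signature and the sk_ prefixed definitions are made up for this example, with the structs copied by hand from <mach-o/loader.h> so the parsing part compiles anywhere. It assumes a thin (non-fat) 64-bit Mach-O in native byte order and a signature that starts at file offset 0 of the slice. I haven't measured how much of the cost this actually moves out of dlopen.

```c
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hand-copied subset of <mach-o/loader.h>; names are local to this sketch. */
#define SK_MH_MAGIC_64       0xfeedfacfu
#define SK_LC_CODE_SIGNATURE 0x1du

struct sk_mach_header_64 {
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds, sizeofcmds, flags, reserved;
};

struct sk_linkedit_data_command {
    uint32_t cmd, cmdsize, dataoff, datasize;
};

/* Walk the load commands looking for LC_CODE_SIGNATURE.
 * Returns 0 and fills *off / *size on success, -1 if absent or malformed. */
static int find_code_signature(int fd, uint32_t *off, uint32_t *size)
{
    struct sk_mach_header_64 hdr;
    off_t pos = sizeof(hdr);
    uint32_t i;

    if (pread(fd, &hdr, sizeof(hdr), 0) != (ssize_t) sizeof(hdr))
        return -1;
    if (hdr.magic != SK_MH_MAGIC_64)
        return -1; /* fat, 32-bit or byte-swapped files are not handled */

    for (i = 0; i < hdr.ncmds; i++) {
        struct sk_linkedit_data_command lc;

        /* every load command starts with cmd and cmdsize (8 bytes) */
        if (pread(fd, &lc, 8, pos) != 8 || lc.cmdsize < 8)
            return -1;
        if (lc.cmd == SK_LC_CODE_SIGNATURE) {
            if (pread(fd, &lc, sizeof(lc), pos) != (ssize_t) sizeof(lc))
                return -1;
            *off = lc.dataoff;
            *size = lc.datasize;
            return 0;
        }
        pos += lc.cmdsize;
    }
    return -1;
}

/* Ask the kernel to attach and validate the signature ahead of dlopen().
 * Returns 0 on success, -1 on any failure or when not built for macOS. */
static int preload_code_signature(const char *path)
{
    uint32_t off = 0, size = 0;
    int rc = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (find_code_signature(fd, &off, &size) == 0) {
#ifdef __APPLE__
        fsignatures_t sig;

        memset(&sig, 0, sizeof(sig));
        sig.fs_file_start = 0;                        /* thin file: slice starts at 0 */
        sig.fs_blob_start = (void *) (uintptr_t) off; /* a file offset, not a pointer */
        sig.fs_blob_size = size;
        rc = fcntl(fd, F_ADDFILESIGS, &sig) == -1 ? -1 : 0;
#endif
    }
    close(fd);
    return rc;
}
```

In the tester above you would call preload_code_signature("tmpfile") somewhere off the critical path after writing the file and before the timed dlopen. A dylib without an LC_CODE_SIGNATURE command (an unsigned x86_64 build, say) simply makes it return -1.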