I have observed a blocking behavior of C NIFs when they were being called concurrently by many Erlang processes. Can it be made non-blocking? Is there a mutex
at work here which I'm not able to comprehend?
P.S. A basic "Hello world" NIF can be tested by making it sleep
for a hundred microseconds
in case of a particular PID
calling it. It can be observed that the other PIDs calling the NIF wait for that sleep to execute before their execution.
Non blocking behavior would be beneficial in cases where concurrency might not pose an issue(e.g. array push, counter increment).
I am sharing the links to 4 gists which comprise of a spawner
, conc_nif_caller
and niftest
module respectively. I have tried to tinker with the value of Val
and I have indeed observed a non-blocking behavior. This is confirmed by assigning a large integer parameter to the spawn_multiple_nif_callers
function.
Links spawner.erl,conc_nif_caller.erl,niftest.erl and finally niftest.c.
The line below is printed by the Erlang REPL on my Mac.
Erlang/OTP 17 [erts-6.0] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
I think the responses about long-running NIFs are off the mark, since your question says you're running some simple "hello world" code and are sleeping for just 100 us. It's true that ideally a NIF call shouldn't take more than a millisecond, but your NIFs likely won't cause scheduler issues unless they run consistently for tens of milliseconds at a time or more.
I have a simple NIF called rev/1
that takes a string argument, reverses it, and returns the reversed string. I stuck a usleep
call in the middle of it, then spawned 100 concurrent Erlang processes to invoke it. The two thread stacktraces shown below, based on Erlang/OTP 17.3.2, show two Erlang scheduler threads both inside the rev/1
NIF simultaneously, one at a breakpoint I set on the NIF C function itself, the other blocked on the usleep
inside the NIF:
Thread 18 (process 26016):
#0 rev (env=0x1050d0a50, argc=1, argv=0x102ecc340) at nt2.c:9
#1 0x000000010020f13d in process_main () at beam/beam_emu.c:3525
#2 0x00000001000d5b2f in sched_thread_func (vesdp=0x102829040) at beam/erl_process.c:7719
#3 a0x0000000100301e94 in thr_wrapper (vtwd=0x7fff5fbff068) at pthread/ethread.c:106
#4 0x00007fff8a106899 in _pthread_body ()
#5 0x00007fff8a10672a in _pthread_start ()
#6 0x00007fff8a10afc9 in thread_start ()
Thread 17 (process 26016):
#0 0x00007fff8a0fda3a in __semwait_signal ()
#1 0x00007fff8d205dc0 in nanosleep ()
#2 0x00007fff8d205cb2 in usleep ()
#3 0x000000010062ee65 in rev (env=0x104fcba50, argc=1, argv=0x102ec8280) at nt2.c:21
#4 0x000000010020f13d in process_main () at beam/beam_emu.c:3525
#5 0x00000001000d5b2f in sched_thread_func (vesdp=0x10281ed80) at beam/erl_process.c:7719
#6 0x0000000100301e94 in thr_wrapper (vtwd=0x7fff5fbff068) at pthread/ethread.c:106
#7 0x00007fff8a106899 in _pthread_body ()
#8 0x00007fff8a10672a in _pthread_start ()
#9 0x00007fff8a10afc9 in thread_start ()
If there were any mutexes within the Erlang emulator preventing concurrent NIF access, the stacktraces would not show both threads inside the C NIF.
It would be nice if you were to post your code so those willing to help resolve this issue could see what you're doing and perhaps help you find any bottlenecks. It would also be helpful if you were to tell us what version(s) of Erlang/OTP you're using.