the code that I'm talking about is the repro code linked below
I have 2 "states" in the code (both of them are listening to queue global events):
For some reason, Bull job is not reported as stalled when the listener worker is terminated only when the worker is back up.
although it should as stated in the docs:
.on('stalled', function (job) { // A job has been marked as stalled. This is useful for debugging job // workers that crash or pause the event loop. });
From Bull event reference in
.on('stallled', ...)
event handler
GitHub Repro (it contains docker-compose
and everything set up with explanation)
Edit:
Tested on bullmq and it works
I am afraid this works as designed, since in Bull the workers are also responsible of detecting stalled jobs, while in BullMQ you have the QueueScheduler that takes care of this.