python daemon multiprocessing children zombie-process

Python-daemon doesn't kill its kids


When using python-daemon, I'm creating subprocesses like so:

import daemon
import multiprocessing

class Worker(multiprocessing.Process):
    def __init__(self, queue):
        multiprocessing.Process.__init__(self) # required before the process can be started
        self.queue = queue # we wait for things from this in Worker.run()

    ...

q = multiprocessing.Queue()

with daemon.DaemonContext():
    for i in xrange(3):
        Worker(q).start() # a Process does nothing until start() is called

    while True: # let the Workers do their thing
        q.put(_something_we_wait_for())

When I kill the parent daemonic process (i.e. not a Worker) with a Ctrl-C or SIGTERM, etc., the children don't die. How does one kill the kids?

My first thought is to use atexit to kill all the workers, like so:

with daemon.DaemonContext():
    workers = list()
    for i in xrange(3):
        w = Worker(q)
        w.start()
        workers.append(w)

    @atexit.register # requires `import atexit`
    def kill_the_children():
        for w in workers:
            w.terminate()

    while True: # let the Workers do their thing
        q.put(_something_we_wait_for())

However, the children of daemons are tricky things to handle, and I'd be obliged for thoughts and input on how this ought to be done.

Thank you.


Solution

  • Your options are a bit limited. If setting self.daemon = True in the Worker constructor does not solve your problem, and catching signals (i.e., SIGTERM, SIGINT) in the Parent doesn't work either, you may have to try the opposite approach: instead of having the parent kill the children, you can have the children commit suicide when the parent dies.

    The first step is to pass the PID of the parent process to the Worker constructor (you can get it with os.getpid() in the parent). Then, instead of just doing self.queue.get() in the worker loop, do something like this:

    # needs: import os, sys, queue  (on Python 2: import Queue and catch Queue.Empty)
    waiting = True
    while waiting:
        # see if Parent is at home
        if os.getppid() != self.parentPID:
            # woe is me! My Parent has died!
            sys.exit() # or whatever you want to do to quit the Worker process
        try:
            # I picked the timeout randomly; use what works.
            # block must be True, otherwise the timeout is ignored.
            data = self.queue.get(block=True, timeout=0.1)
            waiting = False
        except queue.Empty:
            continue # try again
    # now do stuff with data
    

    The solution above checks whether the parent PID differs from what it originally was (that is, whether the child process was adopted by init or launchd because the parent died) - see reference. However, if that doesn't work for some reason, you can replace the check with the following method (adapted from here):

    def parentIsAlive(self):
        try:
            # try to call Parent
            os.kill(self.parentPID, 0)
        except OSError:
            # *beeep* oh no! The phone's disconnected!
            return False
        else:
            # *ring* Hi mom!
            return True
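
Putting the two snippets above together, a worker might look roughly like the sketch below. Treat it as a sketch under assumptions, not a drop-in: it assumes Python 3 and a POSIX system, and the names parent_is_alive, get_or_exit and handle are ones I made up for illustration (they are not part of multiprocessing or python-daemon).

```python
import multiprocessing
import os
import queue  # multiprocessing.Queue.get() raises queue.Empty on timeout
import sys


def parent_is_alive(parent_pid):
    """Return True while this process is still a child of parent_pid."""
    # On POSIX an orphaned child is re-parented, so getppid() changes.
    return os.getppid() == parent_pid


def get_or_exit(q, parent_pid, poll_interval=0.1):
    """Wait for an item on q, exiting the process if the parent goes away."""
    while True:
        if not parent_is_alive(parent_pid):
            sys.exit(0)
        try:
            # block=True so that the timeout is actually honoured
            return q.get(block=True, timeout=poll_interval)
        except queue.Empty:
            continue  # nothing yet; check on the parent again


class Worker(multiprocessing.Process):
    def __init__(self, q, parent_pid):
        super().__init__()
        self.queue = q
        self.parent_pid = parent_pid  # os.getpid() as seen by the parent

    def handle(self, data):
        """Hypothetical placeholder for the real work."""

    def run(self):
        while True:
            self.handle(get_or_exit(self.queue, self.parent_pid))
```

The parent would create each worker with something like Worker(q, os.getpid()) after it has daemonized, so that the PID handed over is the one the children will actually see as their parent.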
    

    Now, when the Parent dies (for whatever reason), the child Workers will spontaneously drop like flies - just as you wanted, you daemon! :-D
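
If you'd rather try the parent-side route first (the daemon flag plus catching signals in the Parent, as mentioned at the top), here is a rough stdlib-only sketch. Assumptions: Python 3, a POSIX system, and made-up names (idle_forever, make_cleanup_handler, main). It deliberately leaves python-daemon out; as far as I know DaemonContext sets up its own signal handling when it opens, so inside the context you would probably need to install the handler after entering it.

```python
import multiprocessing
import signal
import sys
import time


def idle_forever():
    """Hypothetical stand-in for the real worker loop."""
    while True:
        time.sleep(1)


def make_cleanup_handler(workers):
    """Build a signal handler that terminates `workers` and then exits."""
    def handler(signum, frame):
        for w in workers:
            w.terminate()      # ask each child to stop
        for w in workers:
            w.join(timeout=5)  # reap it so no zombie is left behind
        sys.exit(0)
    return handler


def main():
    workers = []
    for _ in range(3):
        w = multiprocessing.Process(target=idle_forever)
        w.daemon = True  # cleaned up automatically on a normal parent exit
        w.start()
        workers.append(w)

    # Install the handler only after the children exist, so that forked
    # children do not inherit it.
    handler = make_cleanup_handler(workers)
    signal.signal(signal.SIGTERM, handler)
    signal.signal(signal.SIGINT, handler)

    while True:  # the parent's own work would go here
        time.sleep(1)
```

The reason for the explicit handler: the daemon flag only helps when the parent shuts down through the normal interpreter exit path, and an unhandled SIGTERM skips that path entirely. Turning the signal into sys.exit() gives the cleanup a chance to run. Call main() from your own entry point to try it.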