python, python-daemon, openrc

How can I correctly implement a Python daemon which could fail to start?


I have a Python daemon that uses the python-daemon package to daemonize. It can also run in the foreground (by omitting the -d command line parameter). It periodically runs a function and also starts a minimal HTTP server that can be used to communicate with it (the full code can be found at GitLab).

Stripped down to a minimal example, it's:

#!/usr/bin/env python3

import sys
import signal
import argparse
import daemon
import daemon.pidfile
from syslog import syslog
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler
from time import strftime

parser = argparse.ArgumentParser()
parser.add_argument("-d", action = "store_true", help = "daemonize")
args = parser.parse_args()

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"I'm here")

class ProcessManager:
    def __init__(self):
        self.timer = None
        self.server = None
        self.signalCatched = False
        self.finished = threading.Event()

    def setup(self) -> bool:
        syslog("Setting up different stuff")
        # All kinds of things that could fail; on failure, setup() returns False

        syslog("Setting up the HTTP server")
        try:
            self.server = HTTPServer(("127.0.0.1", 8000), RequestHandler)
        except Exception as error:
            # Log the reason as well, so a failed startup can be diagnosed
            syslog("Failed to set up the HTTP server: {}".format(error))
            return False

        return True

    def start(self):
        thread = threading.Thread(target = self.server.serve_forever)
        thread.start()
        self.scheduleNextRun()

    def scheduleNextRun(self):
        if self.signalCatched:
            return

        syslog("Daemon running at {}".format(strftime("%Y-%m-%d %H:%M:%S")))

        self.timer = threading.Timer(3, self.scheduleNextRun)
        self.timer.start()

    def terminate(self, signum, frame):
        syslog("Caught signal, will now terminate")
        self.signalCatched = True

        if self.timer:
            self.timer.cancel()

        self.server.shutdown()

        self.finished.set()

def setupProcessManager():
    if not processManager.setup():
        sys.exit(1)

    signal.signal(signal.SIGTERM, processManager.terminate)
    signal.signal(signal.SIGINT, processManager.terminate)

    processManager.start()

processManager = ProcessManager()

if args.d:
    with daemon.DaemonContext(pidfile = daemon.pidfile.PIDLockFile("/run/test.pid")):
        syslog("Starting up in daemon mode")
        setupProcessManager()
        processManager.finished.wait()
else:
    syslog("Starting up in foreground mode")
    setupProcessManager()

I wrote a minimal OpenRC init script to run it as a daemon, which also works fine: I can start and stop the daemon as one would expect.

The problem is that I can't detect whether the startup failed. When it runs in daemon mode, the sys.exit(1) has no effect, because as soon as the pidfile is created, OpenRC counts this as a successful startup. Also, the parent process that launches the daemon apparently exits successfully.
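
The second observation can be reproduced without python-daemon. The following is a minimal sketch, assuming (as described above) that the launching process exits with status 0 right after forking; SCRIPT is a hypothetical stand-in for the daemon and uses plain os.fork, not the actual python-daemon code path:

```python
import subprocess
import sys

# Hypothetical stand-in for the daemon: the launcher forks and exits
# successfully, and only afterwards does the detached child fail its setup.
SCRIPT = """
import os, sys
if os.fork() > 0:
    os._exit(0)   # launcher: reports success right after the fork
sys.exit(1)       # detached child: the setup failure happens too late
"""

# subprocess.run() waits only for the launcher, as an RC system would.
result = subprocess.run([sys.executable, "-c", SCRIPT])
print(result.returncode)  # 0, although the child exited with status 1
```

The exit status of the forked child never reaches whoever started the launcher, so a check that fails after the fork cannot be reported this way.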

I can't set up the daemon outside of the DaemonContext. If I move the signal handler registration out of it, signals aren't handled anymore. If I only call processManager.start() inside the DaemonContext, the recurring function call works, but the HTTP server is not reachable.

So: How do I implement this correctly, so that everything keeps working, but the RC system is able to detect if the startup failed?


Solution

  • I finally figured it out.

    The trick is to exit with a non-zero status before the daemon forks. I thought this was not possible, because the HTTP server would not work if I didn't set it up inside the DaemonContext. However, it's possible to set up everything, including the HTTP server, before forking if the context is changed a bit (based on feedback from here and the Gentoo forums, and inspired by Daemonizing python's BaseHTTPServer):

    # Setup the process manager
    processManager = ProcessManager(args)
    if not processManager.setup():
        sys.exit(1)
    
    def startDaemon():
        if args.d:
            # Terminate on SIGTERM if we want to daemonize
            signal.signal(signal.SIGTERM, processManager.terminate)
        else:
            # Terminate on SIGINT (Ctrl+C) if we're running in the foreground
            signal.signal(signal.SIGINT, processManager.terminate)
    
        # Start up the real thing
        processManager.start()
    
    # Start the process, either daemonized or in foreground
    if args.d:
        # We want to daemonize, create a DaemonContext
        daemonContext = daemon.DaemonContext()
        # Define the pidfile (args.p holds the pidfile path; that option is
        # not part of the minimal example in the question)
        daemonContext.pidfile = daemon.pidfile.PIDLockFile(args.p)
        # This enables the already set up HTTP server to run inside the DaemonContext
        daemonContext.files_preserve = [ processManager.server.fileno() ]
        # Start the daemon
        with daemonContext:
            startDaemon()
            processManager.finished.wait()
    else:
        # We want to run in foreground
        startDaemon()
    

    This way, the startup script exits with a non-zero status if the startup is not possible, which OpenRC can then detect.
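
The effect of doing the setup before the fork can be tried with the standard library alone. HTTPServer binds and listens in its constructor, so a port conflict surfaces while the process is still the one the RC system started. This is a minimal sketch, not the daemon's actual code: try_setup_server, blocker and exit_status are hypothetical names, and the port conflict is provoked on purpose (behaviour as on Linux, where a second bind to a listening port fails).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional


def try_setup_server(host: str, port: int) -> Optional[HTTPServer]:
    """Bind and listen right away; return None if that is not possible."""
    try:
        return HTTPServer((host, port), BaseHTTPRequestHandler)
    except OSError:
        return None


# Occupy a port first so that the second setup attempt has something to
# fail on. Port 0 lets the OS pick a free port.
blocker = try_setup_server("127.0.0.1", 0)
busy_port = blocker.server_address[1]

# The setup that would run before creating the DaemonContext.
server = try_setup_server("127.0.0.1", busy_port)

# This is the value the script would pass to sys.exit() before forking.
exit_status = 0 if server is not None else 1
print(exit_status)

blocker.server_close()
```

Because the failure is known before any fork happens, the non-zero status comes from the very process that the init script launched.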