python, multithreading, tkinter, logging, multiprocessing

How to share a configured logger between running processes?


I'm trying to build a small GUI app, with logging, that will at some point be doing some time-consuming manipulation of multiple sets of data (up to several hundred). Naturally, I wanted to use multiprocessing to help speed things up a bit. I was following the second example in the Logging Cookbook in the Python docs and trying to work it into my code. In this stripped-down minimal example, clicking the Build button should simply log a few messages. The problem, which is hopefully obvious to someone more learned in these topics, is that it doesn't work as intended: the application prints only 3 of the 5 messages to the console, and exactly zero are added to the log file.

What I was expecting, obviously, was that all 5 messages would be logged with the logger instance created in gui.py.

I've tried merging methods, moving methods out of the class into module-level functions, creating the Queue/loggers in different places, and passing the first logger instance around as an argument. Everything I've tried so far either leads to the same result or throws a pickling error and eventually ends in an EOFError. The code given is just the most recent revision that doesn't throw an exception.

I'm just trying to get some direction on "where" I'm messing up. In case it matters, this is on Windows 10 using Python 3.12.

# gui.py

from multiprocessing import Queue
import tkinter as tk
from tkinter import ttk

import builder
import logger

class Gui(tk.Tk):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        btn = ttk.Button(self, text='Build', command=lambda: builder.Build(q))
        btn.pack()
        log.configure(q)
        self.mainloop()

if __name__ == '__main__':
    q = Queue()
    log = logger.Logger()
    Gui()
# logger.py

import logging
import logging.handlers

class Logger:
    def __init__(self):
        self.logger = logging.getLogger('dcre')
        self.logger.setLevel('DEBUG')
        self.log = self.logger.log

    def configure(self, q):
        self.q = q
        self.qh = logging.handlers.QueueHandler(self.q)
        self.file = logging.FileHandler('log.log')
        self.file.setLevel('DEBUG')
        self.logger.addHandler(self.qh)
        self.logger.addHandler(self.file)
        logging.basicConfig(level='INFO')
# builder.py

import logging
from multiprocessing import Process
import threading

class Build:
    def __init__(self, q):
        self.queue = q
        self.logger = logging.getLogger('dcre')

        workers = []
        for i in range(5):
            wp = Process(target=self.foo, args=(i,))
            workers.append(wp)
            wp.start()

        lp = threading.Thread(target=self.logger_thread)
        lp.start()
        for wp in workers:
            wp.join()

        self.queue.put(None)
        lp.join()

    def logger_thread(self):
        while True:
            record = self.queue.get()
            if record is None:
                break
            self.logger.handle(record)

    def foo(self, i):
        msgs = (
            (10, "This is a DEBUG message. You shouldn't see this."),
            (20, 'This is an INFO message. Just so you know.'),
            (30, 'This is a WARNING message. Be careful, yo.'),
            (40, 'This is an ERROR message. Man, you done messed up.'),
            (50, 'This is a CRITICAL message. Game over!')
        )
        self.logger.log(*msgs[i])

Note: The configure method for the logger only exists to delay configuration until after the GUI is created, so that it has access to a Text widget for a custom handler to write to.


Solution

  • Sharing a logger object between processes can get weird fast.

    I would recommend using the QueueHandler class, as you have already started to do, together with a QueueListener and a single top-level logging configuration: all the handler and formatter settings live on the listener in the main process, and every process (main included) gets just a QueueHandler. For what it's worth, the reason you see only 3 of the 5 messages is that Windows spawns its child processes: they never run your __main__ block, so the 'dcre' logger inside each child has no handlers and falls back to logging's last-resort handler, which prints WARNING and above to stderr; and since nothing in the children ever reaches the parent's FileHandler, the log file stays empty. The approach below works in Python 3.11.9.

    For example

    import multiprocessing
    import logging
    import logging.config
    from logging.handlers import QueueListener
    from logging import StreamHandler
    import atexit
    
    STD_FORMAT_STR = r"[%(levelname)s @ %(name)s|%(filename)s|L%(lineno)d|%(asctime)s|%(process)d]: %(message)s"
    STD_DATE_FORMAT_STR = "%Y-%m-%dT%H:%M:%S%z"
    
    def create_share_logging_queue():
        ### CREATE THE SHARED LOGGING QUEUE ###
        loggingQueue = multiprocessing.Queue()

        ### CREATE HANDLERS ###
        standardFormat = logging.Formatter(fmt=STD_FORMAT_STR,
                                           datefmt=STD_DATE_FORMAT_STR)
        _hndlr = StreamHandler()
        _hndlr.setLevel("DEBUG")
        _hndlr.setFormatter(standardFormat)

        ### CREATE AND START QUEUE LISTENER ###
        queueListener = QueueListener(loggingQueue, _hndlr, respect_handler_level=True)
        queueListener.start()
        atexit.register(queueListener.stop)  # ensures queue resources are cleared when dying
        return loggingQueue

    def setup_mp_device_queued_logging(loggingQueue):
        """
        1. Take in the shared mp.Queue that every process writes to.
        2. Configure this process's root logger with a QueueHandler that
           pushes all records onto that queue.

        Args:
            loggingQueue: the shared multiprocessing.Queue
        """
        loggingDictConfig = {
            "version": 1,
            "disable_existing_loggers": True,
            "formatters": {},
            "handlers": {
                "myQueueHandler": {
                    "class": "logging.handlers.QueueHandler",
                    "queue": loggingQueue
                }
            },
            "root": {
                "level": "DEBUG",
                "handlers": ["myQueueHandler"]
            },
        }
        ### CREATE THE QUEUEHANDLER + CONFIGURE LOGGING ###
        logging.config.dictConfig(loggingDictConfig)

    def someFoo(myQ: multiprocessing.Queue, nm: str):
        setup_mp_device_queued_logging(myQ)
        logger = logging.getLogger(nm)
        logger.debug(f"Hello from {nm}")
    
    if __name__ == "__main__":
        myQ = create_share_logging_queue()
        setup_mp_device_queued_logging(myQ)
        logger = logging.getLogger("MAIN")
    
        p1 = multiprocessing.Process(target=someFoo, args=(myQ,"CoolProc1"))
        p2 = multiprocessing.Process(target=someFoo, args=(myQ,"CoolProc2"))
        p1.start()
        p2.start()
        logger.debug("Hello From Main")
        p1.join()
        p2.join()
    

    OUT:

    [DEBUG @ MAIN|test.py|L72|2024-09-05T16:46:42-0500|26284]: Hello From Main
    [DEBUG @ CoolProc1|test.py|L60|2024-09-05T16:46:42-0500|11940]: Hello from CoolProc1
    [DEBUG @ CoolProc2|test.py|L60|2024-09-05T16:46:42-0500|19016]: Hello from CoolProc2
    
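    A QueueListener runs its own internal thread that pulls records off the queue and hands them to the handlers you give it, so it replaces the hand-rolled logger_thread (and its None sentinel) from builder.py. It also fans out to any number of handlers, and with respect_handler_level=True it honors each handler's own level. That means restoring your file logging is just one more handler passed to the listener; a hedged tweak to create_share_logging_queue above, reusing its standardFormat ('log.log' is the filename from your logger.py):

    fileHndlr = logging.FileHandler('log.log')
    fileHndlr.setLevel("DEBUG")
    fileHndlr.setFormatter(standardFormat)

    queueListener = QueueListener(loggingQueue, _hndlr, fileHndlr,
                                  respect_handler_level=True)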

    NOTES:

    If other modules create their own noisy loggers, you can also raise their levels after configuration, e.g.:

    VALID_LOGGERS = ['MAIN', 'CoolProc1', 'CoolProc2']
    disabled_logger_cnt = 0
    for log_name in logging.Logger.manager.loggerDict:
        if log_name not in VALID_LOGGERS:
            log_obj = logging.getLogger(log_name)
            log_obj.setLevel(logging.WARNING)
            disabled_logger_cnt += 1
    
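    Applied to your own code, the same pattern looks roughly like this (a sketch under the assumptions above, not a drop-in fix): create the queue and start the listener in gui.py's __main__ block, and in builder.py move the worker out of the class so it pickles cleanly under Windows' spawn start method, attaching a QueueHandler once inside each child:

    # builder.py (sketch)
    import logging
    import logging.handlers
    from multiprocessing import Process

    def foo(q, i):
        # Runs in a fresh child process: under spawn it starts with no
        # logging config, so ship every record back to the main process.
        root = logging.getLogger()
        root.setLevel(logging.DEBUG)
        root.addHandler(logging.handlers.QueueHandler(q))
        msgs = (
            (10, "This is a DEBUG message. You shouldn't see this."),
            (20, 'This is an INFO message. Just so you know.'),
            (30, 'This is a WARNING message. Be careful, yo.'),
            (40, 'This is an ERROR message. Man, you done messed up.'),
            (50, 'This is a CRITICAL message. Game over!')
        )
        logging.getLogger('dcre').log(*msgs[i])

    class Build:
        def __init__(self, q):
            workers = [Process(target=foo, args=(q, i)) for i in range(5)]
            for wp in workers:
                wp.start()
            for wp in workers:
                wp.join()
            # No sentinel or logger thread needed: the QueueListener in
            # the main process drains the queue by itself.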

    === TLDR ===

    Don't share logger objects between processes. Do use a Queue to pass log records to a listener in the main process. Preferably write one function that configures logging, and configure and get loggers exactly once per process.
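
    Finally, for the Text widget handler mentioned in your note: that handler belongs on the QueueListener in the GUI process, alongside the console and file handlers. One caveat: the listener services records on a background thread, and Tk widgets should only be touched from the main thread, so marshal the update through widget.after(). A minimal sketch (TextHandler is a name I made up, not a stdlib class):

    import logging

    class TextHandler(logging.Handler):
        """Append formatted records to a tk.Text widget."""
        def __init__(self, text_widget):
            super().__init__()
            self.text_widget = text_widget

        def emit(self, record):
            msg = self.format(record)
            # emit() runs on the QueueListener's thread; hop over to the
            # Tk event loop before touching the widget.
            self.text_widget.after(0, self._append, msg)

        def _append(self, msg):
            self.text_widget.insert('end', msg + '\n')
            self.text_widget.see('end')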