I am having trouble passing global parameters to multiple workers. When I run a workflow with more than one worker, the command line parameter values are not passed to the tasks. I suspect this is related to the problem observed when luigi runs on Windows (#2247), although in my case it happens on macOS. A second issue is that the log format differs, so it seems that logging.cfg is not passed to the workers either. I opened a ticket on GitHub (#3236) but got no response.
The following toy example summarizes the problem.
import luigi
import logging

logger = logging.getLogger('luigi-interface')


class HelloConfig(luigi.Config):
    reference = luigi.Parameter(default="World")


class HelloTask(luigi.Task):
    def run(self):
        logger.info("Hello %s!", HelloConfig().reference)

    def requires(self):
        return []
I run the task with:
luigi --module weekly_update.etl.load_ex HelloTask \
--workers ? \
--HelloConfig-reference "Mars"
When workers is 1, the log shows
2023-04-21 10:59:09,386 - luigi-interface - INFO - [MainThread] - Hello Mars!
but when workers is 2, it shows
2023-04-21 10:59:53,030 [INFO]-load_ex.run: Hello World!
As of Python 3.8, macOS defaults to the spawn start method instead of fork, so it now shows issues that previously appeared only on Windows. You can change the start method with:
import multiprocessing
multiprocessing.set_start_method('fork')
I'm not sure there is a better solution, and we're also struggling with command line parameters across multiple workers when not on Linux.
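To illustrate why the start method matters, here is a minimal standard-library sketch. It is not luigi code: SETTINGS, read_reference, reference_seen_by_child and ensure_fork_start_method are hypothetical names, and SETTINGS only stands in for state that a parent process fills in at runtime (as, presumably, happens with parsed command line values). The helper switches to fork only where the platform offers it.

```python
import multiprocessing

# Hypothetical stand-in for state the parent fills in at runtime.
SETTINGS = {"reference": "World"}


def ensure_fork_start_method():
    """Switch to 'fork' where available and return the active start method."""
    if "fork" in multiprocessing.get_all_start_methods():
        # force=True avoids a RuntimeError if a method was already chosen.
        multiprocessing.set_start_method("fork", force=True)
    return multiprocessing.get_start_method()


def read_reference(queue):
    """Runs in the child: report the value this process sees."""
    queue.put(SETTINGS["reference"])


def reference_seen_by_child():
    """Change the value in the parent, then ask a child process for it."""
    SETTINGS["reference"] = "Mars"
    ctx = multiprocessing.get_context(multiprocessing.get_start_method())
    queue = ctx.Queue()
    child = ctx.Process(target=read_reference, args=(queue,))
    child.start()
    value = queue.get()
    child.join()
    return value
```

With fork, the child is a copy of the parent's memory, so it sees the updated value. With spawn, the child starts a fresh interpreter and re-imports the module, so module-level state falls back to its default. When calling this from a script, do so under an `if __name__ == "__main__":` guard, since spawn re-imports the main module in every child.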