
Luigi: Running Tasks in a Docker Image


I am testing Luigi's abilities for running tasks in a Docker container, i.e. I would like Luigi to spawn a container from a given Docker image and execute a task therein.

As far as I understood, there is luigi.contrib.docker_runner.DockerTask for this purpose. I tried to modify its command and add an output:

import luigi
from luigi.contrib.docker_runner import DockerTask

class Task(DockerTask):
    def output(self):
        return luigi.LocalTarget("bla.txt")
    def command(self):
        return f"touch {self.output()}"

if __name__ == "__main__":
    luigi.build([Task()], workers=2, local_scheduler=True)

But I am getting an error (originally posted as a screenshot, which is not reproduced here).

It seems that there is a TypeError in docker. Is my use of DockerTask erroneous? Unfortunately, I cannot find any examples for use cases...


Solution

  • Here is the answer from my reply to your email:

    First, self.output() returns a luigi.LocalTarget object, not a file path or string, therefore f"touch {self.output()}" isn't a valid shell command. The file path is available as the target's path attribute.

    Second, I just looked into the documentation of DockerTask: https://luigi.readthedocs.io/en/stable/_modules/luigi/contrib/docker_runner.html#DockerTask. There, command should be a property, not a method. This explains your error message: the code accesses "Task().command" without calling it, so it gets a function and not a string. A solution might be to use a static attribute for the output name:

    class Task(DockerTask):
        output_name = "bla.txt"
    
        def output(self):
            return luigi.LocalTarget(self.output_name)
    
        # No "self" at class level; refer to the class attribute directly.
        command = f"touch {output_name}"
    

    or to use a property decorator:

    class Task(DockerTask):
    
        def output(self):
            return luigi.LocalTarget("bla.txt")
    
        @property
        def command(self):
            target = self.output()  # a LocalTarget object, not a string or file path
            return f"touch {target.path}"
    

    I'm not sure whether this second variant would work, though; I'd prefer the first option.

    In the future, make sure to check the documentation for the base class that you're implementing, e.g. whether an attribute is a method or a property, and maybe use a debugger to understand what's happening.

    And for the sake of searchability and saving data and bandwidth, in the future try to post error logs as text rather than as screenshots, either as a snippet or via a link to a pastebin or something like that.
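
The method-versus-property point can be reproduced without Luigi or Docker. The following is only a minimal sketch: FakeTarget and FakeDockerBase are hypothetical stand-ins (they are not part of Luigi), and the TypeError is raised by the stand-in itself, on the assumption that the real base class reads self.command as an attribute and passes it on as a string.

```python
class FakeTarget:
    """Hypothetical stand-in for luigi.LocalTarget: an object wrapping a path."""

    def __init__(self, path):
        self.path = path


class FakeDockerBase:
    """Hypothetical stand-in for a base class that reads `command` as an attribute."""

    @property
    def command(self):
        return "echo hello world"

    def run(self):
        # The base class only reads the attribute; it never calls it.
        cmd = self.command
        if not isinstance(cmd, str):
            raise TypeError(f"expected a command string, got {type(cmd).__name__}")
        return cmd


class MethodTask(FakeDockerBase):
    def output(self):
        return FakeTarget("bla.txt")

    def command(self):  # plain method: replaces the property with a function
        return f"touch {self.output().path}"


class PropertyTask(FakeDockerBase):
    def output(self):
        return FakeTarget("bla.txt")

    @property
    def command(self):  # property: attribute access yields the string itself
        return f"touch {self.output().path}"


def try_run(task):
    """Run a task and return either its command or the error text."""
    try:
        return task.run()
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_run(MethodTask()))
    print(try_run(PropertyTask()))
```

In this sketch, try_run(PropertyTask()) returns "touch bla.txt", while try_run(MethodTask()) returns a TypeError message, because attribute access on MethodTask hands back a bound method instead of a string.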