I am trying to learn, in a very simple way, how luigi works. As a newbie, I came up with this code:
    import luigi


    class class1(luigi.Task):
        def requires(self):
            return class2()

        def output(self):
            return luigi.LocalTarget('class1.txt')

        def run(self):
            print('IN class A')


    class class2(luigi.Task):
        def requires(self):
            return []

        def output(self):
            return luigi.LocalTarget('class2.txt')


    if __name__ == '__main__':
        luigi.run()
Running this from the command prompt gives an error at this line:

    raise RuntimeError('Unfulfilled %s at run time: %s' % (deps, ', '.join(missing)))

which is:

    RuntimeError: Unfulfilled dependency at run time: class2__99914b932b
This happens because you define an output for class2 but never create it.
Let's break it down...
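When running

    python file.py class2 --local-scheduler

luigi will ask:

- Is the output of class2 already on disk? NO
- Does class2 have dependencies? NONE
- So it executes the run method (by default it is an empty method, i.e. pass), which finishes without errors.

However, when running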
    python file.py class1 --local-scheduler

luigi will go through these steps:

- Is the output of class1 already on disk? NO
- Does class1 have dependencies? YES: class2
- Is the output of class2 on disk? NO
- Run class2 -> running -> done without errors
- Is the output of class2 on disk? NO -> raise error

luigi never runs a task unless all of its dependencies are met (i.e. their output is on the file system).