I have this situation:
tasks.py

import time

from celery import group, task

@task
def add(a, b):
    return a + b

@task
def other():
    chunks = [1, 1, 1, 1]  # dummy data
    for index in range(3):
        # wait for each group to finish then continue to the next one
        res = group(add.s(i, i) for i in chunks).apply_async()
        # sleep for 1 second if group is not ready
        while not res.get():
            time.sleep(1)
Could this lead to a deadlock while waiting for the group of tasks to finish? Even in the theoretical situation of having only 1 Celery worker?
You are waiting for the result of the group task inside the other task, so it might lead to a deadlock even with one Celery worker.
Having a task wait for the result of another task is really inefficient, and may even cause a deadlock if the worker pool is exhausted.
Note: this just gives a warning in Celery 3.1, but from Celery 3.2 onwards it will raise an exception.
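For reference, if a task really does have to block on another result (still discouraged), newer Celery versions make you opt in explicitly with the allow_join_result context manager. A minimal sketch, assuming a configured result backend and the add task from above:

from celery import task
from celery.result import allow_join_result

@task
def blocking_example():
    res = add.delay(2, 2)
    # without this context manager, calling get() inside a task
    # raises an error in newer Celery versions
    with allow_join_result():
        return res.get()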
So, it is better to make your design asynchronous. You can do it with a simple modification.
from celery import chain, group, task

@task
def other():
    chunks = [1, 1, 1, 1]
    my_tasks = []
    for i in range(3):
        # using si() to make the signature immutable, so that its
        # arguments don't get replaced by a preceding task's result
        group_task = group(add.si(n, n) for n in chunks)
        # instead of executing each group immediately, collect the
        # signatures so they can be chained
        my_tasks.append(group_task)
    # delay() is shorthand for apply_async(); the chain runs the
    # groups one after another without blocking this worker
    chain(*my_tasks).delay()
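If you also need to run something once every group has finished, append a callback to the same chain instead of polling. A minimal sketch, where all_done is a hypothetical task added for illustration; note that chaining a group with a following task upgrades it to a chord, which requires a result backend:

from celery import chain, task

@task
def all_done():
    # hypothetical callback: runs only after every group
    # in the chain has completed
    print('all groups finished')

chain(*my_tasks, all_done.si()).delay()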