I have a Django application with long-running, I/O-bound tasks.
I use Celery to run these tasks in the background and track their progress in the UI with a progress bar (a rough sketch of that mechanism is shown below).
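To give an idea of how the progress bar is driven, here is a minimal sketch (the task name, the view and the meta keys are placeholders, not my real code): the bound task pushes its progress into the result backend with update_state, and the page polls a small JSON view.

from celery import shared_task
from celery.result import AsyncResult
from django.http import JsonResponse

@shared_task(bind=True)
def long_io_task(self, total_items):
    # Stand-in for one of my I/O-bound tasks
    for i in range(total_items):
        # ... one unit of I/O-bound work ...
        self.update_state(state='PROGRESS',
                          meta={'current': i + 1, 'total': total_items})
    return 'done'

def task_progress(request, task_id):
    # Polled by the front-end to update the progress bar
    result = AsyncResult(task_id)
    info = result.info if isinstance(result.info, dict) else str(result.info)
    return JsonResponse({'state': result.state, 'info': info})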
Here's my configuration:
Django version: 5.0.2
Celery version: 5.3.6
Redis version: Redis for Windows 5.0.14.1 (https://github.com/tporadowski/redis/releases)
SERVER
Windows Server 2016 (can't change that; I have data stored in an Access database)
App hosted in the IIS default AppPool
Processor: 4 cores
RAM: 4 GB
web.config configuration:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <handlers>
      <add name="Python FastCGI" path="*" verb="*" modules="FastCgiModule" scriptProcessor="C:\Python311\python.exe|C:\Python311\Lib\site-packages\wfastcgi.py" resourceType="Unspecified" requireAccess="Script" />
    </handlers>
    <directoryBrowse enabled="true" />
  </system.webServer>
  <appSettings>
    <add key="PYTHONPATH" value="C:\inetpub\Django-LIAL\WEBAPPLIAL" />
    <add key="WSGI_HANDLER" value="WEBAPPLIAL.wsgi.application" />
    <add key="DJANGO_SETTINGS_MODULE" value="WEBAPPLIAL.settings" />
  </appSettings>
</configuration>
Django WSGI configuration:
# gevent monkey-patching must run before anything else is imported
from gevent import monkey
monkey.patch_all()

import os
from django.core.wsgi import get_wsgi_application

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'WEBAPPLIAL.settings')
application = get_wsgi_application()
Django Celery configuration:
# Celery settings
CELERY_BROKER_URL = 'redis://127.0.0.1:6379/0'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_BACKEND = 'django-db'
CELERY_CACHE_BACKEND = 'django-cache'
CELERY_TASK_ALWAYS_EAGER = False
CELERY_TASK_TRACK_STARTED = True
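For completeness, the Celery app module (WEBAPPLIAL/celery.py) is essentially the stock pattern from the Celery documentation for Django projects; sketched below from memory, so take the exact lines with a grain of salt:

# WEBAPPLIAL/celery.py
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'WEBAPPLIAL.settings')

app = Celery('WEBAPPLIAL')
# Pick up every CELERY_* entry from the Django settings
app.config_from_object('django.conf:settings', namespace='CELERY')
# Find tasks.py modules in the installed apps
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print(f'Request: {self.request!r}')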
Celery command line, launched from Git Bash:
$ celery -A WEBAPPLIAL worker -l info -P gevent
*** What the celery command line prints: ***
-------------- celery@WIN-RHK2AHPNGJ1 v5.3.6 (emerald-rush)
--- ***** -----
-- ******* ---- Windows-10-10.0.14393-SP0 2024-05-17 12:05:49
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: WEBAPPLIAL:0x17207492650
- ** ---------- .> transport: redis://127.0.0.1:6379/0
- ** ---------- .> results:
- *** --- * --- .> concurrency: 4 (gevent)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. APPLICATION.A13.A13_LOG_0002.model.task.extract_data
. APPLICATION.A13.A13_LOG_0005.tasks.launch_app
. WEBAPPLIAL.celery.debug_task
[2024-05-17 12:05:49,995: WARNING/MainProcess] C:\Python311\Lib\site-packages\celery\worker\consumer\consumer.py:507: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
warnings.warn(
[2024-05-17 12:05:50,010: INFO/MainProcess] Connected to redis://127.0.0.1:6379/0
[2024-05-17 12:05:50,010: WARNING/MainProcess] C:\Python311\Lib\site-packages\celery\worker\consumer\consumer.py:507: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
warnings.warn(
[2024-05-17 12:05:50,026: INFO/MainProcess] mingle: searching for neighbors
[2024-05-17 12:05:51,048: INFO/MainProcess] mingle: all alone
[2024-05-17 12:05:51,048: WARNING/MainProcess] C:\Python311\Lib\site-packages\celery\worker\consumer\consumer.py:507: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
warnings.warn(
[2024-05-17 12:05:51,048: INFO/MainProcess] pidbox: Connected to redis://127.0.0.1:6379/0.
[2024-05-17 12:05:51,063: INFO/MainProcess] celery@WIN-RHK2AHPNGJ1 ready.
A quick look at my task functions:
@shared_task(bind=True)
def launch_app(self, laiteries, formated_date):
    ...

@shared_task(bind=True)
def extract_data(self, date_start, date_end):
    ...
Both are called with .delay() (see the sketch just below); each interacts with the Django ORM, but on a different model.
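The dispatch from the views boils down to this (simplified; the surrounding view code and argument construction are omitted):

# Simplified dispatch from the Django views
extract_data.delay(date_start, date_end)
launch_app.delay(laiteries, formated_date)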
Actual behaviour
When I launch the first task (by interacting with the web app) and immediately launch the second one, this is what happens:
[2024-05-17 12:06:28,464: INFO/MainProcess] Task APPLICATION.A13.A13_LOG_0002.model.task.extract_data[baf19fc9-dd9c-4574-af8d-c7ed9a522c0e] received
[2024-05-17 12:06:56,144: INFO/MainProcess] Task APPLICATION.A13.A13_LOG_0002.model.task.extract_data[baf19fc9-dd9c-4574-af8d-c7ed9a522c0e] succeeded in 27.60899999999998s: 'Procédure terminée !'
[2024-05-17 12:06:56,159: INFO/MainProcess] Task APPLICATION.A13.A13_LOG_0005.tasks.launch_app[435df153-9879-47a4-93ba-5ba9ed90cf76] received
[2024-05-17 12:07:01,662: INFO/MainProcess] Task APPLICATION.A13.A13_LOG_0005.tasks.launch_app[435df153-9879-47a4-93ba-5ba9ed90cf76] succeeded in 5.5s: 'Tout les emails ont bien été envoyer !'
The problem: Celery executes the tasks sequentially, not in parallel.
My expected behavior would be something like this:
[2024-05-17 12:06:28,464: INFO/MainProcess] Task APPLICATION.A13.A13_LOG_0002.model.task.extract_data[baf19fc9-dd9c-4574-af8d-c7ed9a522c0e] received
[2024-05-17 12:06:29,159: INFO/MainProcess] Task APPLICATION.A13.A13_LOG_0005.tasks.launch_app[435df153-9879-47a4-93ba-5ba9ed90cf76] received
[2024-05-17 12:07:34,662: INFO/MainProcess] Task APPLICATION.A13.A13_LOG_0005.tasks.launch_app[435df153-9879-47a4-93ba-5ba9ed90cf76] succeeded in 5.5s: 'Tout les emails ont bien été envoyer !'
[2024-05-17 12:06:56,144: INFO/MainProcess] Task APPLICATION.A13.A13_LOG_0002.model.task.extract_data[baf19fc9-dd9c-4574-af8d-c7ed9a522c0e] succeeded in 27.60899999999998s: 'Procédure terminée !'
If you need any more details, please ask!
Somehow, switching from gevent to threads resolves the problem:
$ celery -A WEBAPPLIAL worker -l info -P threads 100