[SOLVED] Worker threads/Queues to process datasets after an upload?

I'm writing a web application with Django where users can upload files with statistical data.

The data needs to be processed before it can be used properly, and each dataset can take up to a few minutes to process. My idea was to offload the data processing into a separate Python thread.

However, since I'm using uWSGI, I've read about a feature called "Spoolers". The documentation on it is rather short, but I think it might be what I'm looking for. Unfortunately, the -Q option for uwsgi requires a directory, which confuses me.

Anyway, what are the best practices to implement something like worker threads which don't block uwsgi's web workers so I can reliably process data in the background while still having access to Django's database/models? Should I use threads instead?

Solution

All of the offloading subsystems need some kind of 'queue' to store the 'things to do'.

The uWSGI Spooler uses a printer-like approach where each file in the spool directory is a task, which is why the option requires a directory. When a task is done, its file is removed. Other systems rely on heavier, more advanced servers such as RabbitMQ.
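To make the "one file per task" idea concrete, here is a minimal pure-Python sketch of a directory-backed queue. It is only an illustration of the concept, not how uWSGI implements its spooler; the function names `enqueue` and `run_pending` and the JSON file format are assumptions made up for this example.

```python
import json
import os
import tempfile
import uuid


def enqueue(spool_dir, task):
    """Store one task as one file; the file existing means 'still to do'."""
    final_path = os.path.join(spool_dir, "task_%s.json" % uuid.uuid4().hex)
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump(task, fh)
    # Rename only after the file is fully written, so a worker scanning the
    # directory never picks up a half-written task.
    os.rename(tmp_path, final_path)
    return final_path


def run_pending(spool_dir, handler):
    """Run handler on every queued task, removing each file once it is done."""
    done = 0
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".json"):
            continue  # skip temporary files still being written
        path = os.path.join(spool_dir, name)
        with open(path) as fh:
            task = json.load(fh)
        handler(task)    # if this raises, the file stays and can be retried
        os.remove(path)  # task finished: remove it from the "queue"
        done += 1
    return done


if __name__ == "__main__":
    spool_dir = tempfile.mkdtemp()
    enqueue(spool_dir, {"dataset_id": 1})
    enqueue(spool_dir, {"dataset_id": 2})
    run_pending(spool_dir, print)
```

Because the tasks live on disk rather than in memory, queued work survives a restart of the web workers, which is the main advantage over a plain in-process thread.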

Finally, do not use the spooler's low-level API directly; rely on the decorators instead:

http://projects.unbit.it/uwsgi/wiki/Decorators
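As a rough sketch of what the decorator approach can look like: the `uwsgidecorators` module provides a `spool` decorator, and the decorated function is queued by calling its `.spool(...)` method with keyword arguments. The module can only be imported inside a uWSGI process started with a spooler directory (the -Q option from the question), so the fallback below is a hypothetical stand-in that simply runs the task immediately. The names `process_dataset`, `dataset_id`, and `processed` are made up for illustration.

```python
try:
    # Only importable inside uWSGI with the spooler enabled.
    from uwsgidecorators import spool
except ImportError:
    # Hypothetical stand-in for running outside uWSGI: it executes the task
    # synchronously instead of queueing it, just to keep the sketch runnable.
    def spool(func):
        func.spool = lambda **kwargs: func(kwargs)
        return func


processed = []  # stand-in for the real work, so the effect is observable


@spool
def process_dataset(arguments):
    # The task receives its arguments as a dict. Values pass through the
    # spool file, so they may arrive as strings or bytes rather than ints.
    raw_id = arguments["dataset_id"]
    if isinstance(raw_id, bytes):
        raw_id = raw_id.decode("utf-8")
    dataset_id = int(raw_id)
    # In a Django project this is where the model would be loaded and the
    # slow processing done, e.g. via a hypothetical Dataset model.
    processed.append(dataset_id)


def handle_upload(dataset_id):
    # Called from the upload view: returns right away under uWSGI, while the
    # spooler process picks the task up in the background.
    process_dataset.spool(dataset_id=str(dataset_id))
```

Passing only the primary key (as a string) and reloading the object inside the task keeps the spool file small and avoids serializing model instances.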