Is it the actual process in which Celery is running, or is it another process? In Flower, I can see multiple processes in a worker pool. What is the difference between these two?
It turns out that Celery nodes are indirectly documented here:
In short, Celery uses a set of terms that are useful to understand when building a system of distributed work.
Related terms that help when planning such a system include:
At this point, take note that the Client, Broker and Worker can all be on different machines; in fact there can be multiple Clients and multiple Workers, each on different machines, as long as they all use the same Broker.
It should be no surprise, then, that the application typically has the Broker configured with a URL. That is, all the Applications, in all the Clients and all the Workers, use the same Broker URL and hence the same Broker.
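As a minimal sketch of what that looks like (the app name and RabbitMQ URL below are only examples), every Client and Worker constructs its app against the same broker URL:

```python
from celery import Celery

# Clients and Workers all build the app with the same Broker URL,
# so they are all talking to the same Broker (URL shown is illustrative).
app = Celery('tasks', broker='amqp://guest:guest@localhost:5672//')
```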
The Clients send (produce) messages via the Broker, requesting tasks to run; the Workers read (consume) those messages.
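A sketch of that produce/consume flow, assuming the app above lives in a module called tasks.py (the module and task names are illustrative):

```python
# tasks.py -- shared by Clients and Workers
from celery import Celery

app = Celery('tasks', broker='amqp://guest:guest@localhost:5672//')

@app.task
def add(x, y):
    return x + y
```

A Client produces a message by calling the task asynchronously, and a Worker process consumes it:

```python
from tasks import add

# Client side: sends a message to the Broker and returns immediately.
result = add.delay(2, 3)
```

```bash
# Worker side: this process consumes messages from the Broker and runs the task.
celery -A tasks worker --loglevel=INFO
```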
Now these terms all have a place:
Each Worker can process multiple tasks at once by maintaining an execution pool. This pool might consist of threads or, by default, of subprocesses. So a Worker may have a number of Pool processes as children.
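For example, the pool type and its size are chosen when the Worker is started (the flags below are standard celery worker options; the project name is illustrative, and the threads pool assumes a Celery version that supports it):

```bash
# Default prefork pool with 4 child processes
# (these are the processes Flower shows in the worker's pool)
celery -A proj worker --concurrency=4

# A thread-based pool instead of subprocesses
celery -A proj worker --pool=threads --concurrency=8
```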
One of the frustrations I have with Celery is that you can communicate liberally with Workers but not with the running tasks in a Worker's Execution Pool (which is why I am creating a new Task class for interactive tasks, but it's still evolving).
A Node is just a Worker in a Cluster. In short, Node = Worker. A Cluster is a number of Workers running in parallel (started with celery multi, as per the document I introduced with). A Cluster is just a convenient way of starting, stopping and managing multiple Workers on the same machine, as sketched below.
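For instance, starting and stopping a Cluster of Workers (Nodes) on one machine with celery multi looks roughly like this (the node names w1/w2 and the project name are illustrative):

```bash
# Start a Cluster of two Workers (Nodes) named w1 and w2 on this machine
celery multi start w1 w2 -A proj -l INFO

# Stop them again, waiting for currently running tasks to finish
celery multi stopwait w1 w2 -A proj -l INFO
```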
There may be many Clusters all consuming tasks from the same Broker, and those Clusters may be on the same machine (though one would wonder why) or on different machines.
And that is what a Celery Node is ... (in its fullest context).