amazon-web-services, airflow, mwaa

AWS Managed Airflow (MWAA): are resources (vCPU and memory) shared between workers or not?


I would like to deploy an Airflow cluster with AWS Managed Airflow (MWAA). AWS offers different environment classes for Airflow, e.g. "mw1.small" with 1 vCPU and 2 GB RAM [1]. Moreover, one can set a "minimum worker count" and a "maximum worker count".

I assume that the cluster automatically scales the number of workers up and down between the minimum and maximum worker count, and that each active Airflow DAG occupies one "worker" (although I'm not sure about this; the AWS documentation is cryptic to me).

What I don't understand: Does each worker have the resources specified by the environment class (e.g. 1 vCPU and 2 GB RAM per worker), or are the resources of the environment class shared between all workers?

For example, in a "mw1.small" environment, if I have 5 DAGs running in parallel (presumably on 5 workers?), does each worker have access to 1 vCPU and 2 GB RAM?

[1] https://docs.aws.amazon.com/mwaa/latest/userguide/environment-class.html


Solution

  • Each MWAA worker gets its own vCPU and memory allocation, determined by the environment class; the resources are not shared between workers. For instance, in a small environment (mw1.small) each worker has 1 vCPU and 2 GB of RAM.

    Additionally, the setting celery.worker_autoscale specifies the maximum and minimum number of tasks that can run concurrently on any given worker. For a small environment this is set to 5,5 by default, so up to five tasks (not DAGs) can run on one worker and share its 1 vCPU and 2 GB of RAM.

    For more details, refer to the AWS documentation.
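
To make the arithmetic concrete, here is a minimal Python sketch of how the worker count and celery.worker_autoscale combine into an upper bound on concurrently running tasks. The function names are hypothetical helpers written for this illustration, not part of Airflow or any AWS SDK, and the sketch assumes the "max,min" ordering of the setting described above.

```python
def parse_worker_autoscale(value: str) -> tuple[int, int]:
    """Parse a celery.worker_autoscale string such as "5,5" into (max, min)."""
    max_tasks, min_tasks = (int(part) for part in value.split(","))
    return max_tasks, min_tasks


def max_concurrent_tasks(worker_count: int, worker_autoscale: str) -> int:
    """Upper bound on tasks running at once across all workers.

    Each worker has its own vCPU and memory; the tasks running on the
    same worker share that worker's allocation.
    """
    max_tasks_per_worker, _ = parse_worker_autoscale(worker_autoscale)
    return worker_count * max_tasks_per_worker


# mw1.small default of "5,5": one worker runs up to 5 tasks on 1 vCPU / 2 GB.
print(max_concurrent_tasks(1, "5,5"))  # 5
# If the environment has scaled out to 3 workers, each with its own 1 vCPU / 2 GB:
print(max_concurrent_tasks(3, "5,5"))  # 15
```

So with the defaults, five parallel tasks may well land on a single small worker and share its 1 vCPU and 2 GB, rather than each getting a worker of its own.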