
AWS MWAA (Managed Workflows for Apache Airflow): where to put the Python code used in the DAGs?


Where do you put your actual code? The DAGs must be thin, which assumes that when a task starts to run, it does the imports and then runs some Python code.

When we were on standalone Airflow, I could add my project root to PYTHONPATH and do the imports from there, but with AWS managed Airflow I can't find any clues on how to do this.


Solution

  • Put your DAGs into S3. When you create your MWAA environment, you specify the S3 bucket that contains your code.

    For example, create a bucket <my-dag-bucket> and place your DAGs in a subfolder named dags:

    s3://<my-dag-bucket>/dags/ 
    

    Also make sure to define all Python dependencies in a requirements file and put it in the same bucket:

    s3://<my-dag-bucket>/requirements.txt
    

    Finally, if you need to provide your own modules, zip them up and put the zip file in the bucket, too:

    s3://<my-dag-bucket>/plugins.zip
    

    See https://docs.aws.amazon.com/mwaa/latest/userguide/get-started.html
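
The zip step can be sketched in Python. This is a minimal sketch, not an official MWAA tool: the function name build_plugins_zip and the local folder layout (a directory plugins/ holding a package my_project/) are hypothetical, chosen only for illustration. It keeps paths relative to the source folder so that your package sits at the root of the archive.

```python
import zipfile
from pathlib import Path
from typing import List


def build_plugins_zip(source_dir: str, zip_path: str) -> List[str]:
    """Zip every .py file under source_dir and return the archived names.

    Hypothetical helper for illustration. Paths inside the archive are
    relative to source_dir, so a package folder directly under source_dir
    ends up at the root of the zip.
    """
    source = Path(source_dir)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Sort for a stable, reproducible archive order.
        for path in sorted(source.rglob("*.py")):
            arcname = path.relative_to(source).as_posix()
            archive.write(path, arcname)
            added.append(arcname)
    return added


if __name__ == "__main__":
    # Assumed local layout: plugins/my_project/__init__.py, plugins/my_project/...
    for name in build_plugins_zip("plugins", "plugins.zip"):
        print(name)
```

After building the archive, upload it to s3://<my-dag-bucket>/plugins.zip (for example with `aws s3 cp`) and point the environment at it. Assuming MWAA makes the archive root importable, a task could then use `from my_project import ...` inside its callable, which keeps the DAG file itself thin.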