airflow

Debugging Broken DAGs


When the airflow webserver shows up errors like Broken DAG: [<path/to/dag>] <error>, how and where can we find the full stacktrace for these exceptions?

I tried these locations:

/var/log/airflow/webserver -- had no logs in the timeframe of execution, other logs were in binary and decoding with strings gave no useful information.

/var/log/airflow/scheduler -- had some logs but were in binary form, tried to read them and looked to be mostly sqlalchemy logs probably for airflow's database.

/var/log/airflow/worker -- shows up the logs for running DAGs, (same as the ones you see on the airflow page)

and then also under /var/log/airflow/rotated -- couldn't find the stacktrace I was looking for.

I am using airflow v1.7.1.3


Solution

  • Usually I use the command airflow list_dags which prints the full stacktrace for a Python error found in the DAGs.

    This approach will work with almost any Airflow command as Airflow parses the dags folder each time you use an Airflow CLI command.