In Apache Airflow, if a DAG's catchup is set to True, the scheduler creates every run that was missed since start_date. So if I turn a DAG off and turn it back on a year later, it will schedule a huge number of runs, which I want to avoid. Is there a way to limit catchup to a specific interval? For example, only catch up on the runs that fall within the past month from the current time.
Thanks a lot in advance!
Besides start_date, a DAG also accepts an optional end_date parameter.
Set end_date on your DAG to bound the period for which runs are scheduled: no runs are created for dates after it.
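from datetime import datetime

from airflow import DAG

dag = DAG(
    dag_id='my_dag',
    # ... (other DAG arguments)
    catchup=True,
    start_date=datetime(2021, 1, 1),
    # No runs are scheduled for dates after end_date.
    end_date=datetime(2022, 2, 1),
)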
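
Note that end_date is a fixed date, so it does not by itself give a rolling "last month only" window. One possible complement, offered here only as a sketch, is to let catchup create the runs but skip the real work for runs that are too old. The helper below is plain Python; the names is_recent_enough and MAX_CATCHUP_AGE are hypothetical, and the 30-day window is an assumption taken from the "1 month" in the question.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical window size; the question asks for roughly one month.
MAX_CATCHUP_AGE = timedelta(days=30)


def is_recent_enough(logical_date, now=None, max_age=MAX_CATCHUP_AGE):
    """Return True if a run's logical date lies within max_age of now."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - logical_date <= max_age


# Fixed "now" so the example is reproducible.
now = datetime(2022, 2, 1, tzinfo=timezone.utc)
print(is_recent_enough(datetime(2022, 1, 15, tzinfo=timezone.utc), now))  # True
print(is_recent_enough(datetime(2021, 6, 1, tzinfo=timezone.utc), now))   # False
```

A callable like this could presumably be used as the python_callable of a ShortCircuitOperator placed first in the DAG, so that downstream tasks are skipped for old runs. That wiring is an assumption and should be checked against the Airflow version in use. The old runs would still be created and marked as finished; only their work is skipped.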