I am currently using the Airflow 2.0 TaskFlow API, and I am having trouble combining TaskGroup with BranchPythonOperator.
Below is my code:
import airflow
from airflow.models import DAG
from airflow.decorators import task
from airflow.operators.python import BranchPythonOperator, PythonOperator
from airflow.utils.task_group import TaskGroup

default_args = {
    'owner': 'Airflow',
    'start_date': airflow.utils.dates.days_ago(2),
}

@task
def dummy_task():
    return {}

@task
def task_b():
    return {}

@task
def task_c():
    return {}

def final_step():
    return {}

def get_tasks(**kwargs):
    # Branch callable: returns the id to follow (this is what fails)
    task = 'task_a'
    return task

with DAG(dag_id='branch_dag',
         default_args=default_args,
         schedule_interval=None) as dag:

    with TaskGroup('task_a') as task_a:
        obj = dummy_task()

    tasks = BranchPythonOperator(
        task_id='check_api',
        python_callable=get_tasks,
    )

    final_step = PythonOperator(
        task_id='final_step',
        python_callable=final_step,
        trigger_rule='one_success'
    )

    b = task_b()
    c = task_c()

    tasks >> task_a >> final_step
    tasks >> b >> final_step
    tasks >> c >> final_step
When I trigger this DAG, I get the following error in the check_api task:
airflow.exceptions.TaskNotFound: Task task_a not found
Is it possible to get this working, using TaskGroup in conjunction with BranchPythonOperator?
Thanks,
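BranchPythonOperator is expected to return task_ids, and task_a is the id of a TaskGroup, not of a task. By default, tasks inside a TaskGroup have their task_id prefixed with the group id, so dummy_task is registered as task_a.dummy_task. You need to change the get_tasks function to: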
def get_tasks(**kwargs):
    task = 'task_a.dummy_task'
    return task
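
If the group contains more than one task to branch into, the same idea can be generalized. The sketch below is plain Python with no Airflow import; it assumes the default TaskGroup behaviour of prefixing child task ids with the group id, and `qualified_task_ids` is a hypothetical helper name, not part of Airflow. The branch callable returns a list here, which BranchPythonOperator also accepts.

```python
from typing import Iterable, List


def qualified_task_ids(group_id: str, task_ids: Iterable[str]) -> List[str]:
    """Hypothetical helper: build '<group_id>.<task_id>' for each task,
    mirroring how a TaskGroup prefixes its children by default."""
    return [f"{group_id}.{task_id}" for task_id in task_ids]


def get_tasks(**kwargs) -> List[str]:
    # Return the ids of the tasks inside the group, not the group id itself.
    return qualified_task_ids("task_a", ["dummy_task"])
```

With this version, get_tasks returns the fully qualified id of dummy_task, so check_api can find the task it should follow.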