airflow google-cloud-composer

Updating a Variable doesn't seem to have any effect


I am running a DAG that attempts to update a secret stored in GCP Secret Manager. The code looks like this:

    from airflow.models import Variable
    Variable.set(key=secret_name, value=mapping_list)

I get this warning message:

[2024-02-23 19:45:42.627489+00:00] {variable.py:245} WARNING - The variable my_var is defined in the CloudSecretManagerBackend secrets backend, which takes precedence over reading from the database. The value in the database will be updated, but to read it you have to delete the conflicting variable from CloudSecretManagerBackend

How can I ensure that the new value I set gets propagated to the backend?


OK, so here's what happened: the first time I ran my code, it called Variable.set. The DAG ended successfully, with no messages and no warnings, but also NO NEW secret in the secrets backend. I thought that maybe Airflow couldn't create it for some reason, so I created the secret manually.

Then, when I reran the code, I started getting messages like the one above. Today I noticed that in the Airflow UI, on the "Variables" page under Admin, there was a new variable with the correct name and value. That explains the messages about the conflict: the same key existed in both the Airflow database and the secrets backend!

I deleted the variable from the Airflow database and reran my DAG. I STILL get the same warning as above, but now I also get an error that causes the DAG to fail when I call Variable.set:

Variable {key} does not exist in the Database and cannot be updated.

So now I really don't know why I'm getting this error: the variable should be cleared from the database and exist only in the backend.


Solution

  • You are experiencing a key collision, as explained in the search path documentation for secrets backends.

    Before Airflow updates a Variable value, it checks for conflicts (see source code). My recommendation is to avoid storing the same Variable key in both the Airflow DB and the secrets backend. It doesn't make much sense to maintain two versions of the same key, because it raises the question of which one is the source of truth for the data stored under that key.

    Manage the key either as an Airflow Variable or as a key in Cloud Secret Manager, but not both. The simple solution is to remove one of them; the warning will then no longer be shown.

    EDIT:

    To clarify, Variable.set() always creates the variable in the Airflow metastore. It does not write to the secrets backend. This is a design choice in Airflow: secret stores normally don't let ordinary users write keys, and writing requires special permissions (Admin, etc.).

    You see the warning because you called Variable.set(key="my_Var", ...). Airflow checks whether the key exists in the secret store (read access is allowed) and warns you that the same key already exists there. As a result, a later Variable.get("my_Var") returns the value from the secret store and ignores the Airflow Variable.
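
The precedence described above can be illustrated with a toy model. This is only a sketch of the documented lookup order (secrets backend, then environment variables, then the metastore), not Airflow's actual implementation; the class `ToyVariableStore` and its attributes are hypothetical names made up for this example.

```python
from typing import Dict, Optional


class ToyVariableStore:
    """Toy model of Airflow's Variable lookup order. Not Airflow's real code."""

    def __init__(self) -> None:
        self.secrets_backend: Dict[str, str] = {}  # e.g. Cloud Secret Manager (read-only here)
        self.env_vars: Dict[str, str] = {}         # stands in for AIRFLOW_VAR_* variables
        self.metastore: Dict[str, str] = {}        # stands in for the Airflow database

    def set(self, key: str, value: str) -> Optional[str]:
        """Write to the metastore only; return a warning if the key is shadowed."""
        warning = None
        if key in self.secrets_backend:
            warning = f"{key} is defined in the secrets backend, which takes precedence"
        self.metastore[key] = value  # the secrets backend is never written to
        return warning

    def get(self, key: str) -> str:
        """Return the first match: secrets backend, then env vars, then metastore."""
        for source in (self.secrets_backend, self.env_vars, self.metastore):
            if key in source:
                return source[key]
        raise KeyError(key)


store = ToyVariableStore()
store.secrets_backend["my_var"] = "value-in-secret-manager"
print(store.set("my_var", "value-from-set"))  # warns: the key is shadowed
print(store.get("my_var"))                    # still the secrets backend value
```

In this model, the value written by `set` becomes visible through `get` only after the key is removed from `secrets_backend`, which mirrors the advice in the warning message.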

    I raised https://github.com/apache/airflow/pull/37814 to improve the Airflow docs around Variable.set() when working with a secrets backend.
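
Since Variable.set() only writes to the metastore, one way to change the value that the backend returns is to add a new secret version through the Secret Manager client directly. The following is a minimal sketch, not part of the original answer. It assumes the `google-cloud-secret-manager` package is installed, the secret already exists, the service account running the task is allowed to add secret versions, and the backend uses its default `variables_prefix` ("airflow-variables") and `sep` ("-"). The helper names `secret_id_for_variable` and `set_backend_variable` are hypothetical.

```python
def secret_id_for_variable(
    key: str, variables_prefix: str = "airflow-variables", sep: str = "-"
) -> str:
    """Build the secret ID the backend looks up for a Variable key.

    The defaults are assumed to match CloudSecretManagerBackend's defaults;
    adjust them if your backend_kwargs override the prefix or separator.
    """
    return f"{variables_prefix}{sep}{key}"


def set_backend_variable(project_id: str, key: str, value: str) -> str:
    """Add a new version to an existing secret and return the version's resource name."""
    # Imported lazily so the naming helper above works without the client library.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    parent = client.secret_path(project_id, secret_id_for_variable(key))
    version = client.add_secret_version(
        request={"parent": parent, "payload": {"data": value.encode("utf-8")}}
    )
    return version.name
```

A task could then call something like `set_backend_variable("my-project", "my_var", json.dumps(mapping_list))` instead of Variable.set(), where the project ID is a placeholder. Keeping the key only in Secret Manager also avoids the collision described above.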