pythongoogle-cloud-platformgoogle-drive-apiairflowgoogle-sheets-api

Edit Google Drive and Sheets Files (Airflow Google Provider): Insufficient credentials


I am trying to modify a Google Sheets file and a CSV file in Google Drive via the Apache Airflow Google Provider:

df = pd.DataFrame(data)
csv_data = df.to_csv(index=True)

gcs_hook = GCSHook(gcp_conn_id=GOOGLE_CONNECTION)
gcs_hook.upload(
     bucket_name=GOOGLE_CLOUD_BUCKET_NAME,
     object_name=csv_name,
     data=csv_data,
)

sheets_hook = GSheetsHook(gcp_conn_id=GOOGLE_CONNECTION)
sheets_hook.clear(GOOGLE_SHEET_ID, range_=sheet_name)
sheets_hook.append_values(
   GOOGLE_SHEET_ID,
   range_=sheet_name,
   values=[df.columns.tolist()] + df.values.tolist(),
)

The string GOOGLE_CONNECTION is just the name of the google connection that I've defined using the Apache Airflow GUI. This connection points to a credentials.json file (the file exists and is discovered), which I obtained in the following way:

Nevertheless, the operations are unsuccessful:

[2025-04-27, 10:09:26] ERROR - Task failed with exception: source="task"
HttpError: <HttpError 403 when requesting https://sheets.googleapis.com/v4/spreadsheets/1qeNnfd74h6EgjAZkYPWKb3BzYGW5ZZ8h9UyuGvL3_PQ/values/countries:clear?alt=json returned "Request had insufficient authentication scopes.". Details: "[{'@type': 'type.googleapis.com/google.rpc.ErrorInfo', 'reason': 'ACCESS_TOKEN_SCOPE_INSUFFICIENT', 'domain': 'googleapis.com', 'metadata': {'method': 'google.apps.sheets.v4.SpreadsheetsService.ClearValues', 'service': 'sheets.googleapis.com'}}]">

File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 825 in run

File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 1088 in _execute_task

File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/bases/operator.py", line 408 in wrapper

File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/standard/operators/python.py", line 212 in execute

File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/standard/operators/python.py", line 235 in execute_callable

File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/callback_runner.py", line 81 in run

File "/opt/airflow/dags/ingest_and_save.py", line 139 in export_to_gsheet

File "/opt/airflow/dags/ingest_and_save.py", line 127 in write_table_to_Google

File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/google/suite/hooks/sheets.py", line 335 in clear

File "/home/airflow/.local/lib/python3.12/site-packages/googleapiclient/_helpers.py", line 130 in positional_wrapper

File "/home/airflow/.local/lib/python3.12/site-packages/googleapiclient/http.py", line 938 in execute

I searched for an option in the Google Cloud API to set the credentials scope, but couldn't find one. I tried manually appending the last two fields to the credentials.json:

{
  "type": "service_account",
  "project_id": "<my project id>",
  "private_key_id": "<my private key id>",
  "private_key": "<my private key>",
  "client_email": "<my client service account email>",
  "client_id": "<my client id>",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "<client cert url>",
  "universe_domain": "googleapis.com",
  "extra__google_cloud_platform__scopes": ["https://www.googleapis.com/auth/spreadsheets", "https://www.googleapis.com/auth/drive"],
  "scope": ["https://www.googleapis.com/auth/spreadsheets", "https://www.googleapis.com/auth/drive"]
}

However, the error is still the same. What am I doing wrong here?


Solution

  • It turns out everything in Google Cloud has been properly set up and that the Python code works correctly. The problem is that you cannot directly set the scope variable, as I did in the credentials.json file.

    Instead, you should edit the scope manually from the Airflow UI: Admin > Connections > the name of your google connection> Extra Fields > https://www.googleapis.com/auth/spreadsheets,https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/devstorage.read_write.
    Feel free to replace the scopes with whatever your needs are. Here the scopes allow authentication for Google Sheets, Google Drive and Google Cloud Storage services respectively. Notice that the scopes are separated by , without any spaces, and there are no quotes around the string.