Tags: python, authentication, google-cloud-platform, google-api, google-drive-api

How to authenticate Google APIs (Google Drive API) from Google Compute Engine and locally without downloading Service Account credentials?


Our company is working on processing data from Google Sheets (within Google Drive) from Google Cloud Platform and we are having some problems with the authentication.

There are two different places where we need to run code that makes API calls to Google Drive: within production in Google Compute Engine, and within development environments i.e. locally on our developers' laptops.

Our company is quite strict about credentials and does not allow the downloading of Service Account credential JSON keys (this is better practice and provides higher security). Seemingly all of the docs from GCP say to simply download the JSON key for a Service Account and use that. Alternatively, the Google APIs/Developers docs say to create an OAuth2 Client ID and download its key, like here.
They often use code like this:

from google.oauth2 import service_account

SCOPES = ['https://www.googleapis.com/auth/sqlservice.admin']
SERVICE_ACCOUNT_FILE = '/path/to/service.json'

credentials = service_account.Credentials.from_service_account_file(
        SERVICE_ACCOUNT_FILE, scopes=SCOPES)

But we can't (or just don't want to) download our Service Account JSON keys, so we're stuck if we just follow the docs.

For the Google Compute Engine environment we have been able to authenticate using GCP Application Default Credentials (ADC), i.e. not explicitly specifying credentials in code and letting the client libraries “just work”. This works great as long as the VM is created with the correct scope, https://www.googleapis.com/auth/drive, and the default compute Service Account email is given permission to the Sheet that needs to be accessed; this is explained in the docs here. You can do this like so:

from googleapiclient.discovery import build
service = build('sheets', 'v4')
SPREADSHEET_ID="<sheet_id>"
RANGE_NAME="A1:A2"
s = service.spreadsheets().values().get(
    spreadsheetId=SPREADSHEET_ID,
    range=RANGE_NAME, majorDimension="COLUMNS"
).execute()
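
For reference, the same call can be written with the Application Default Credentials made explicit. This is only a sketch: the names `first_column`, `read_first_column` and `DRIVE_SCOPE` are hypothetical, and on Compute Engine the VM's access scopes still decide what the token can actually do, so passing `scopes` here is an assumption about intent rather than a guarantee.

```python
from typing import Any, Dict, List

# Scope named in the question; any other value here would be an assumption.
DRIVE_SCOPE = "https://www.googleapis.com/auth/drive"


def first_column(value_range: Dict[str, Any]) -> List[Any]:
    """Return the first column of a values.get response (majorDimension="COLUMNS").

    The "values" key may be absent when the requested range is empty,
    so an empty list is returned in that case.
    """
    columns = value_range.get("values", [])
    return columns[0] if columns else []


def read_first_column(spreadsheet_id: str, range_name: str) -> List[Any]:
    """Fetch a range with ADC and return its first column."""
    # Imported lazily so first_column() works without the Google libraries.
    import google.auth
    from googleapiclient.discovery import build

    # google.auth.default() resolves ADC: the metadata server on GCE,
    # or the file written by gcloud on a developer machine.
    credentials, _project = google.auth.default(scopes=[DRIVE_SCOPE])
    service = build("sheets", "v4", credentials=credentials)
    response = (
        service.spreadsheets()
        .values()
        .get(spreadsheetId=spreadsheet_id, range=range_name, majorDimension="COLUMNS")
        .execute()
    )
    return first_column(response)
```

Calling `read_first_column("<sheet_id>", "A1:A2")` would then behave like the snippet above, with the credential lookup visible in one place.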

However, how do we do this for development, i.e. locally on our developers' laptops? Again, without downloading any JSON keys, and preferably with the most “just works” approach possible?

Usually we use gcloud auth application-default login to create application default credentials that the Google client libraries pick up and that “just work”, such as for Google Cloud Storage. However, this doesn't work for Google APIs outside of GCP, like the Google Drive API: the same service = build('sheets', 'v4') call fails with this error: “Request had insufficient authentication scopes.” We then tried all kinds of solutions, like:

import google.auth

credentials, project_id = google.auth.default(scopes=["https://www.googleapis.com/auth/drive"])

and

import google.auth
import google_auth_oauthlib

credentials, project_id = google.auth.default()
credentials = google_auth_oauthlib.get_user_credentials(
    ["https://www.googleapis.com/auth/drive"], credentials._client_id, credentials._client_secret
)

and more, all of which give a myriad of errors/issues we can’t get past when trying to authenticate to the Google Drive API :(
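
One way to narrow down an “insufficient authentication scopes” error is to check which scopes the access token actually carries. The sketch below assumes Google's OAuth2 tokeninfo endpoint returns a JSON object with a space-separated `scope` field; the helper names `missing_scopes` and `fetch_token_info` are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Iterable, Set

# Assumed endpoint for inspecting an access token.
TOKENINFO_URL = "https://oauth2.googleapis.com/tokeninfo"


def missing_scopes(token_info: Dict[str, Any], required: Iterable[str]) -> Set[str]:
    """Return the required scopes that the token does not carry."""
    # The "scope" field is treated as a space-separated string (assumption).
    granted = set(token_info.get("scope", "").split())
    return set(required) - granted


def fetch_token_info(access_token: str) -> Dict[str, Any]:
    """Ask the tokeninfo endpoint to describe an access token (network call)."""
    query = urllib.parse.urlencode({"access_token": access_token})
    with urllib.request.urlopen(f"{TOKENINFO_URL}?{query}") as response:
        return json.load(response)
```

A token to inspect can be printed with `gcloud auth application-default print-access-token`; if `missing_scopes(fetch_token_info(token), ["https://www.googleapis.com/auth/drive"])` is non-empty, the local credentials were created without the Drive scope.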

Any thoughts?


Solution

  • One method for making the authentication from development environments easy is to use Service Account impersonation.

    Here is a blog about using service account impersonation, including the benefits of doing this. @johnhanley (who wrote the blog post) is a great guy and has lots of very informative answers on SO also!

    To have your local machine authenticate for the Google Drive API, you need to create application default credentials on your local machine that impersonate a Service Account and apply the scopes needed for the APIs you want to access.

    To be able to impersonate a Service Account your user must have the role roles/iam.serviceAccountTokenCreator. This role can be applied to an entire project or to an individual Service Account.

    You can use gcloud to do this:

    gcloud iam service-accounts add-iam-policy-binding [COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL] \
    --member user:[USER_EMAIL] \
    --role roles/iam.serviceAccountTokenCreator
    

    Once this is done create the local credentials:

    gcloud auth application-default login \
    --scopes=openid,https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/userinfo.email,https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/accounts.reauth \
    --impersonate-service-account=[COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL]
    

    This will solve the scopes error you got. The extra scopes added beyond the Drive API scope are the default scopes that gcloud auth application-default login applies, and they are needed.

    If you apply scopes without impersonation you will get an error like this when trying to authenticate:

    HttpError: <HttpError 403 when requesting https://sheets.googleapis.com/v4/spreadsheets?fields=spreadsheetId&alt=json returned "Your application has authenticated using end user credentials from the Google Cloud SDK or Google Cloud Shell which are not supported by the sheets.googleapis.com. We recommend configuring the billing/quota_project setting in gcloud or using a service account through the auth/impersonate_service_account setting. For more information about service accounts and how to use them in your application, see https://cloud.google.com/docs/authentication/.">
    

    Once you have set up the credentials you can use the same code that is run on Google Compute Engine on your local machine :)

    Note: it is also possible to set the impersonation for all gcloud commands:

    gcloud config set auth/impersonate_service_account [COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL]
    

    Creating application default credentials on your local machine by impersonating a Service Account is a slick way of authenticating development code. It means that the code will have exactly the same permissions as the Service Account that it is impersonating. If this is the same Service Account that will run the code in production, you know that code in development runs with the same permissions as in production. It also means that you never have to create or download any Service Account keys.
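
As an alternative to baking the impersonation into the gcloud credentials, the same idea can be sketched directly in Python with `google.auth.impersonated_credentials`. This is a hedged illustration rather than part of the answer above: the function names are hypothetical, the service account email is a placeholder, and it assumes your user already holds roles/iam.serviceAccountTokenCreator on that Service Account.

```python
from typing import List, Optional, Sequence

DRIVE_SCOPE = "https://www.googleapis.com/auth/drive"
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def scopes_or_default(scopes: Optional[Sequence[str]]) -> List[str]:
    """Return the requested scopes without duplicates, or just the Drive scope."""
    if not scopes:
        return [DRIVE_SCOPE]
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return list(dict.fromkeys(scopes))


def impersonated_credentials_for(
    service_account_email: str,
    scopes: Optional[Sequence[str]] = None,
    lifetime_seconds: int = 3600,
):
    """Build short-lived credentials that act as the given Service Account."""
    # Imported lazily so scopes_or_default() works without google-auth.
    import google.auth
    from google.auth import impersonated_credentials

    # The developer's own ADC are only used to request the impersonated token.
    source_credentials, _project = google.auth.default(scopes=[CLOUD_PLATFORM_SCOPE])
    return impersonated_credentials.Credentials(
        source_credentials=source_credentials,
        target_principal=service_account_email,
        target_scopes=scopes_or_default(scopes),
        lifetime=lifetime_seconds,
    )
```

The returned object can be passed as `build('sheets', 'v4', credentials=...)`. The trade-off is that the code now names the Service Account explicitly, so it no longer matches the production code line for line the way the gcloud-based setup does.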