rgooglesheets4gargle

Googlesheets quota limit issues - possible failure to use API key


We are currently using google sheets for a research project on crowd forecasts for Covid-19 case and death numbers.

Google Sheets is used for convenience, but we are often running into quota limit issues - even though the number of users we have should be well below what Google allows.

I attempted to create a somewhat reproducible example by setting up a new google account and creating a sheet from which to read.

The first thing I tried (without making any changes to the google account) is this:

library(googledrive)
library(googlesheets4)

# Google sheets authentification -----------------------------------------------
options(gargle_oauth_cache = ".secrets")
drive_auth(cache = ".secrets", email = "iamatestotest@gmail.com")
gs4_auth(token = drive_token())

sheet_id <- "1Z2O5Mce_haceWfduLenJQP-hddXF9biY_4Ydob_psyQ"

n_tries <- 50

for (i in 1:n_tries) {
  data <- read_sheet(ss = sheet_id)
  Sys.sleep(0.5)
  print(i)
}

From what I understand I should be able to make around 300 read requests per minute, but I'm usually not be able to get the loop to run beyond 30-34.

As I wasn't sure the 300 requests are readily available I went to https://console.cloud.google.com, created a new test project (not sure why that is needed) and explicitly activated the googlesheets API and created some credentials. I created an API key as well as an OAuth 2.0 Client ID (although I am admittedly somewhat lost what this does and how to use it).

API quotas

I next tried to login with my api key by running

drive_deauth()
drive_auth_configure(api_key = "thisismyapikey")
gs4_auth(token = drive_api_key())

but that also didn't get me beyond 33ish. I also had a look into the google console, but also couldn't see any traffic - so not sure my API key got actually used?

API usage graphs

I assume this is due to my inability to actually use the API in the intended way. Any help in setting this up / increasing the quota would be much appreciated. If that helps I'm happy to give access to the test account - simply write me a message.


Solution

  • With some kind help from very friendly people I think I mostly figured this out and it was indeed my failure to use the API correctly.

    Why my approach failed

    when you use googlesheets4 and any of its function out of the box, you get asked to authorize the tidyverse API OAuth app (you login with your Google credentials and give the OAuth app access rights). This means that you make all requests through the tidyverse OAuth app, as are all other users in the world who use this functionality. This is very nice as it works out of the box, but runs into limitations if other people are using the package at the same time. Sharing this quota with other people meant that I ran into limitations quite unpredictably.

    How to change the setup to make it work

    There are a couple of things that help to alleviate / solve the problem.

    1. use the devtools version of googlesheets4 (devtools::install_github("tidyverse/googlesheets4"). This dev version of googlesheets4 in turn relies on the dev version of gargle, the package that manages the google authentification. The dev version of gargle has a retry function, that automatically retries your requests if they fail. This should solve the majority of issues.

    2. Get your own OAuth app / google service account. this allows you to manage the authentification process all on your own. You therefore don't have to share your quota with other users around the world. To set up your own OAuth app / google service account, you can do the following (I'm focusing on the google service account here, as that is much easier in practice).

    You should now also be able to track the API requests in the google console dashboard.

    Note that there is still a limit of 60 requests per user per minute, so you're not getting your full 300 requests, but maybe it is possible to create several service accounts and balance the load between these. But not having other people's request interfere with yours is a big improvement!