[python] [pandas] [google-sheets] [google-sheets-api] [gspread]

gspread bulk export of dataframes to separate sheets without triggering API write request limits


I am running up against the Google Sheets API quota for write requests when writing some relatively small dataframes to a single spreadsheet.

The dataframes are held in a dictionary. The KEY is used for the SHEET name and the VALUE is the dataframe to be exported.

Right now I'm running through the dataframes in a loop, creating the SHEET and saving the data.

gc = gspread.oauth(credentials_filename=credential_path("credentials.json"),
                   authorized_user_filename=credential_path("token.json")
                   )

gc_sh = gc.create('rote_output_' + yyyymmddhhss())


# (rote_data is the dictionary of dataframes)
for k, v in rote_data.items():
    gc_sh.add_worksheet(title=k, rows=100, cols=20)  # request 1: create the sheet
    worksheet = gc_sh.worksheet(k)                   # request 2: fetch it again (add_worksheet already returns it)
    worksheet.update([v.columns.values.tolist()] + (v.fillna('')).values.tolist())  # request 3: write the data
    worksheet.freeze(rows=1)                         # request 4: freeze the header row

Each of the four operations in the loop appears to be treated as a separate API request, so with 15 dataframes I fire 60 requests almost instantly and trigger the API's rate limit.

Is there a way to perform the updates as a single 'bulk' request so as not to hit these limits? The batch_update() documentation is light on detail, and I'm not sure whether it's possible to create and then write to the different sheets using this approach, which would be ideal.
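In the meantime I'm limping along by wrapping each call in a retry with exponential backoff. This is a generic sketch (the helper name, `retries`, and `base_delay` are mine; in the real script the `exceptions` tuple would be `(gspread.exceptions.APIError,)`, which gspread raises on a 429 quota error). It only slows the loop down rather than reducing the request count:

```python
import time

def with_backoff(fn, *args, retries=5, base_delay=1.0, exceptions=(Exception,), **kwargs):
    """Call fn(*args, **kwargs), retrying with exponential backoff.

    `exceptions` is kept generic here so the helper has no gspread
    dependency; in practice it would be (gspread.exceptions.APIError,).
    """
    for attempt in range(retries):
        try:
            return fn(*args, **kwargs)
        except exceptions:
            if attempt == retries - 1:
                raise  # out of retries: re-raise the last error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# e.g. with_backoff(worksheet.update, values, exceptions=(APIError,))
```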

**EDIT:** How to create multiple sheets in a spreadsheet by gspread and python?

From that question/answer it appears that batch_update might be able to do this, although it's not clear how I would combine the requests that create the sheets with the actual update of the data on each sheet:

client =  # Please use your client.
spreadsheetId = "###"  # Please set your spreadsheet ID.

employee_names = ["Jonny", "Emma",,,]  # Please set the sheet names.

requests = [
    {
        "addSheet": {
            "properties": {
                "title": e,
                "gridProperties": {"rowCount": 100, "columnCount": 20},
            }
        }
    }
    for e in employee_names ]  
spreadsheet = client.open_by_key(spreadsheetId)
spreadsheet.batch_update({"requests": requests})

Solution

  • I believe your goal is as follows.

    • You want to create one sheet per dataframe in the new spreadsheet and write each dataframe to its own sheet, using as few API requests as possible.

    In this case, how about the following modification?

    Modified script:

    gc_sh = gc.create('rote_output_' + yyyymmddhhss())
    
    # --- I modified the below script.
    # Create request bodies.
    request1 = []  # "addSheet" requests: create every sheet (with frozen header row) in one call.
    request2 = []  # value ranges: write every dataframe in one call.
    for k, v in rote_data.items():
        request1.append({"addSheet": {"properties": {"title": k, "gridProperties": {"rowCount": 100, "columnCount": 20, "frozenRowCount": 1}}}})
        request2.append({"range": f"'{k}'!A1", "values": [v.columns.values.tolist()] + (v.fillna('')).values.tolist()})

    # Use Sheets API: two requests in total, however many dataframes there are.
    gc_sh.batch_update({"requests": request1})
    gc_sh.values_batch_update({"data": request2, "valueInputOption": "USER_ENTERED"})
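To see what the two payloads look like, here is a standalone sketch that builds `request1` and `request2` from plain Python data (the sheet names and values are invented, and lists stand in for the dataframes so it runs without pandas; with real dataframes the values expression is the same `[v.columns.values.tolist()] + (v.fillna('')).values.tolist()` as above):

```python
# Plain lists stand in for the DataFrames: rote_data maps
# sheet title -> (header, rows). Sample data only.
rote_data = {
    "sales": (["region", "total"], [["north", 10], ["south", 7]]),
    "costs": (["item", "amount"], [["ads", 3], ["rent", 5]]),
}

request1 = []  # one addSheet request per sheet, sent in a single batch_update
request2 = []  # one value range per sheet, sent in a single values_batch_update
for title, (header, rows) in rote_data.items():
    request1.append({
        "addSheet": {
            "properties": {
                "title": title,
                "gridProperties": {
                    "rowCount": 100,
                    "columnCount": 20,
                    "frozenRowCount": 1,  # replaces worksheet.freeze(rows=1)
                },
            }
        }
    })
    request2.append({"range": f"'{title}'!A1", "values": [header] + rows})

# Two API calls in total, regardless of how many sheets there are:
# gc_sh.batch_update({"requests": request1})
# gc_sh.values_batch_update({"data": request2, "valueInputOption": "USER_ENTERED"})
```

Because `frozenRowCount` is set at creation time, the per-sheet `freeze(rows=1)` call disappears entirely.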
    
