python-3.x google-sheets scripting

How do I iterate over all the sheets in a Google spreadsheet using Python 3?


I'm trying to read the data from a Google spreadsheet using Python 3. All of my spreadsheets have multiple sheets. I need to get all the data from only the individual sheets that have "Log" in their title.

I've only found examples that process a spreadsheet with a single sheet called "Sheet1". Below is the code I have; it can print the names of the sheets whose titles begin with "Log", but I can't see how to read all the rows and do something useful with that data.

#! /usr/bin/env python3
# -*- mode: python; coding: utf-8 -*-

import os.path
import pickle
import re
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/spreadsheets.readonly']

my_spreadsheets = [
    'sheet id goes here',  # add more spreadsheet ids after the code works for one SS.
]

def get_creds():
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('sheets', 'v4', credentials=creds)
    return service

def main():
    service = get_creds()
    # Iterate over the list of spreadsheets (spreadsheet Ids)
    log_RE = re.compile('^Log ')
    for spreadsheetId in my_spreadsheets:
        sheet_metadata = service.spreadsheets().get(spreadsheetId=spreadsheetId).execute()
        sheets = sheet_metadata.get('sheets', [])
        # Iterate over each of the sheets.
        for sheet in sheets:
            title = sheet.get("properties", {}).get("title", "Sheet1")
            if log_RE.search(title):
                sheet_id = sheet.get("properties", {}).get("sheetId", 0)
                print(title)
                # read all the rows in the sheet and do something useful
                # with that data here.

if __name__ == "__main__":
    main()
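If I'm reading the API docs correctly, the missing step would call the spreadsheets().values().get endpoint with the sheet title as the range, replacing the placeholder comments inside the loop. Something like the sketch below (quoting the title in the range string is my assumption, to cope with sheet names that contain spaces):

# Assumes service, spreadsheetId and title from the enclosing loop above.
# Quoting the title handles sheet names with spaces (e.g. "Log 2020").
result = service.spreadsheets().values().get(
    spreadsheetId=spreadsheetId, range=f"'{title}'").execute()
rows = result.get('values', [])  # a list of rows, each a list of cell values
for row in rows:
    print(row)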

Solution

  • You could use the Sheetfu library (or another Python client library) to make this easier.

    With Sheetfu, you could simply get the rows' data as follows:

    from sheetfu import SpreadsheetApp


    spreadsheet = SpreadsheetApp('path/to/secret.json').open_by_id('<insert spreadsheet id here>')

    for sheet in spreadsheet.sheets:
        if not sheet.name.startswith("Log"):
            # skip sheets that are not logs
            continue
        data_range = sheet.get_data_range()
        values = data_range.get_values()
        for row in values:
            print(row)


    If your data is in the form of tables (with a header row), you should look into working with the Sheetfu Table module, as shown in the README.
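
    For a header-row sheet, that would look roughly like the sketch below (written from my reading of the README; the Table import, the get_field_value accessor, the 'Log 2020' sheet name, and the 'timestamp' column are my assumptions, so check the docs for the exact names):

    from sheetfu import SpreadsheetApp, Table


    spreadsheet = SpreadsheetApp('path/to/secret.json').open_by_id('<insert spreadsheet id here>')
    sheet = spreadsheet.get_sheet_by_name('Log 2020')  # hypothetical sheet name
    table = Table(sheet.get_data_range())  # first row is treated as the header

    for item in table:
        # look a cell up by its header name (assumed accessor)
        print(item.get_field_value('timestamp'))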