python google-api gmail-api google-api-python-client

How to download attachments from Gmail through Google API using Python


I want to download every attachment I received from a particular email address within the last 12 hours, using the Gmail API from Python. I am using the following code.

import base64
from datetime import datetime, timedelta
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
import json
from google.oauth2.service_account import Credentials


# Load service account credentials from JSON file
creds = Credentials.from_service_account_file(r'token.json',
                                              scopes=['https://www.googleapis.com/auth/gmail.readonly'])

# Create Gmail API service
service = build('gmail', 'v1', credentials=creds)

# Calculate the date 12 hours ago from now
now = datetime.utcnow()
time_threshold = now - timedelta(hours=12)
formatted_time_threshold = time_threshold.strftime('%Y-%m-%dT%H:%M:%S.%fZ')

# Define the email address of the sender you want to filter by
sender_email = 'mail@domain.com'

# Define the query to retrieve messages from the sender received in the last 12 hours
query = f'from:{sender_email} after:{formatted_time_threshold}'
print(query)

try:
    # Get messages that match the query
    response = service.users().messages().list(q=query, userId='me').execute()
    messages = response.get('messages', [])
    # Iterate through messages and download attachments
    for msg in messages:
        message = service.users().messages().get(userId='me', id=msg['id']).execute()
        payload = message['payload']

        # Check if message has any attachments
        if 'parts' in payload:
            for part in payload['parts']:
                # Check if part is an attachment
                if part['filename']:
                    filename = part['filename']
                    data = part['body']['data']
                    file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))

                    # Save attachment to local disk
                    with open(filename, 'wb') as f:
                        f.write(file_data)
                    print(f'Saved attachment: {filename}')
except HttpError as error:
    print(f'An error occurred: {error}')

When I run this, I get an HttpError and when I check the error URL, I get the following message.

{
  "error": {
    "code": 401,
    "message": "Request is missing required authentication credential. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.",
    "errors": [
      {
        "message": "Login Required.",
        "domain": "global",
        "reason": "required",
        "location": "Authorization",
        "locationType": "header"
      }
    ],
    "status": "UNAUTHENTICATED"
  }
}

I am not sure what the issue is here. Am I not authenticating properly with the API, or do I need to configure something else for this process?

The token.json file I use to establish the connection has the following structure:

{
  "type": "service_account",
  "project_id": "mailer-attachments-download",
  "private_key_id": "",
  "private_key": "",
  "client_email": "attachment-downloader@mailer-attachments-download.iam.gserviceaccount.com",
  "client_id": "",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/attachment-downloader%40mailer-attachments-download.iam.gserviceaccount.com"
}

I would like to know what I am doing wrong, or whether there are alternative means through which I can achieve my objective.


Solution

  • Since the error specifically mentions the authentication credential, I believe it is fairly safe to assume that the problem is located here:

    creds = Credentials.from_service_account_file(r'token.json',
                                                  scopes=['https://www.googleapis.com/auth/gmail.readonly'])
    

    This is the only line I was able to find that deals with credentials, and the credentials it produces belong to the service account itself. The problem is that a service account does not, by itself, have access to anyone else's email data. This can be fixed by adding impersonation, like so:

    creds = Credentials.from_service_account_file(r'token.json',
                                                  scopes=['https://www.googleapis.com/auth/gmail.readonly'])
    
    delegated_credentials = creds.with_subject('Email_address_of_the_recipient')
    

    Putting the email address of the recipient inside the quotes in the code above tells the client to send the request on behalf of the user whose emails you are trying to access. Make sure to pass delegated_credentials, not creds, to build(). Please keep in mind that for this to work you need to grant the service account domain-wide delegation, which is configured by a Google Workspace administrator.
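
Putting the pieces together, the impersonation step might look like the sketch below. It is only an illustration under a few assumptions: the helper names `build_delegated_gmail_service` and `gmail_after_query` are hypothetical, the key file is the `token.json` from the question, and domain-wide delegation has already been granted for the read-only Gmail scope. The second helper goes slightly beyond the authentication fix: as far as I know, Gmail's `after:` search operator expects a date such as `2024/01/01` or a Unix timestamp in seconds rather than an ISO-8601 string, so the query is built from epoch seconds here.

```python
from datetime import datetime, timedelta, timezone

# Read-only Gmail scope, as used in the question.
SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]


def build_delegated_gmail_service(key_file, user_email, scopes=SCOPES):
    """Return a Gmail API client acting on behalf of user_email.

    Hypothetical helper. Assumes the service account behind key_file has
    been granted domain-wide delegation for the given scopes.
    """
    # Imported lazily so this module loads even without the Google libraries.
    from google.oauth2.service_account import Credentials
    from googleapiclient.discovery import build

    creds = Credentials.from_service_account_file(key_file, scopes=scopes)
    # with_subject() returns a copy of the credentials that impersonates the user.
    delegated = creds.with_subject(user_email)
    # Build the client with the delegated credentials, not the original ones.
    return build("gmail", "v1", credentials=delegated)


def gmail_after_query(sender_email, hours, now=None):
    """Build a search query for mail from sender_email in the last `hours` hours.

    Hypothetical helper. Uses a Unix timestamp (seconds) for `after:`.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    threshold = now - timedelta(hours=hours)
    return f"from:{sender_email} after:{int(threshold.timestamp())}"
```

With these in place, the question's code would create the client with `service = build_delegated_gmail_service('token.json', 'Email_address_of_the_recipient')` and pass `q=gmail_after_query(sender_email, 12)` to `messages().list(...)`; the rest of the loop stays as it is.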