python amazon-web-services amazon-s3

How to generate a URL to download a file from an S3 bucket


I would like to obtain URLs pointing to cloud-optimized GeoTIFFs from Amazon's Copernicus Digital Elevation Model bucket.

To download a single file, I first install boto3 (with pip3 install boto3) and create an anonymous client, relying on this answer to the question Can I use boto3 anonymously?:

import boto3
from botocore import UNSIGNED
from botocore.client import Config

s3 = boto3.client('s3', region_name='eu-central-1', config=Config(signature_version=UNSIGNED))

Then I query for the list of objects in the bucket, using the second line of this answer to the question Use boto3 to download from public bucket:

objects = s3.list_objects(Bucket='copernicus-dem-30m')

I then access one of the values in objects['Contents'], for example the first one (i.e. index 0):

key = objects['Contents'][0]['Key']

key is now:

Copernicus_DSM_COG_10_N00_00_E006_00_DEM/Copernicus_DSM_COG_10_N00_00_E006_00_DEM.tif

I download this file by doing:

s3.download_file('copernicus-dem-30m', key, key.split('/')[-1])

Instead of downloading, how can I generate a URL that I can use later to download the file, for example with wget or by pasting it into a browser?


The code shown above is based on the thread: How to get Copernicus DEM GeoTIFFs for a bounding box using Python.


Solution

  • See Geoffrey’s answer for the format of the S3 URLs for public access buckets.

    To generate a URL that works regardless of whether the bucket/object is public, you can use generate_presigned_url:

    SIGNED_URL_TIMEOUT = 3600  # URL lifetime in seconds (example value, adjust as needed)

    url = s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': 'copernicus-dem-30m', 'Key': key},
        ExpiresIn=SIGNED_URL_TIMEOUT,
    )
    

    … with a suitably chosen SIGNED_URL_TIMEOUT (in seconds).
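
For a publicly readable bucket such as this one, a plain URL can also be assembled by hand, without boto3. The following is a minimal sketch, assuming the common virtual-hosted-style S3 URL layout, the eu-central-1 region used in the question, and a bucket that allows anonymous reads. The helper name public_s3_url is hypothetical and not part of any library.

```python
from urllib.parse import quote


def public_s3_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style HTTPS URL for a publicly readable S3 object.

    Assumption: the bucket permits anonymous GET requests; otherwise the URL
    is well-formed but the request will be rejected.
    """
    # Percent-encode the key, keeping "/" so the path structure is preserved.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


# Key taken from the question's listing of the bucket.
key = (
    "Copernicus_DSM_COG_10_N00_00_E006_00_DEM/"
    "Copernicus_DSM_COG_10_N00_00_E006_00_DEM.tif"
)
url = public_s3_url("copernicus-dem-30m", "eu-central-1", key)
print(url)
```

The printed URL can then be passed to wget or pasted into a browser. Unlike a presigned URL it carries no expiry, but it only works while the object stays publicly readable.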