I am having issues using boto3/the AWS CLI to copy files between buckets that are in different regions (note: I am using Scaleway as my cloud provider, not AWS). I could not get it to work using boto3, but managed to find a solution using rclone. I would like to know whether boto3 is still a possibility, to limit the number of dependencies in my stack.
When performing a cross-region S3 copy operation using Boto3 (or the AWS CLI), the SourceClient parameter in Boto3 and the --endpoint-url parameter in the AWS CLI are not applied consistently. This results in errors when attempting to copy objects from a source bucket in one region to a destination bucket in another region without downloading the objects locally.
Expected Behavior: The object should copy successfully from the source bucket to the destination bucket across regions, using the SourceClient to correctly resolve the source bucket's region.
Actual Behavior: an error is raised.
botocore.exceptions.ClientError: An error occurred (NoSuchBucket) when calling the CopyObject operation: The specified bucket does not exist
The copy command does not use information from the SourceClient input; it only uses the info (credentials, location, etc.) from the client on which the copy method was called.
I also tried this with the aws cli, but got the same results:
aws s3 sync s3://source-bucket s3://dest-bucket \
--source-region fr-par \
--region nl-ams \
--endpoint-url https://s3.fr-par.scw.cloud \
--profile mys3profile
The aws cli seems to fall back on an amazonaws endpoint:
fatal error: Could not connect to the endpoint URL: "https://source-bucket.s3.fr-par.amazonaws.com/?list-type=2&prefix=&encoding-type=url"
import boto3
from dotenv import dotenv_values

config = dotenv_values(".env")

# Initialize source and destination clients
src_session = boto3.Session(
    aws_access_key_id=config.get("SCW_ACCESS_KEY"),
    aws_secret_access_key=config.get("SCW_SECRET_KEY"),
    region_name="fr-par",
)
src_s3 = src_session.client(
    service_name="s3",
    region_name="fr-par",
    endpoint_url="https://s3.fr-par.scw.cloud",
)

dest_session = boto3.Session(
    aws_access_key_id=config.get("SCW_ACCESS_KEY"),
    aws_secret_access_key=config.get("SCW_SECRET_KEY"),
    region_name="nl-ams",
)
dest_s3 = dest_session.client(
    service_name="s3",
    region_name="nl-ams",
    endpoint_url="https://s3.nl-ams.scw.cloud",
)

# Set up source parameters
copy_source = {
    "Bucket": "source_bucket_name",
    "Key": "source_object_name",
}

# Attempt to copy with SourceClient
dest_s3.copy(
    copy_source,
    "destination_bucket_name",
    "source_object_name",  # destination key (was an undefined variable)
    SourceClient=src_s3,
)
I could not get it to work using boto3, but I did find a solution that was acceptable to me using rclone.
Example config, to be placed in ~/.config/rclone/rclone.conf:
[scw_s3_fr]
type = s3
provider = Scaleway
access_key_id = ...
secret_access_key = ...
region = fr-par
endpoint = s3.fr-par.scw.cloud
acl = private
[scw_s3_nl]
type = s3
provider = Scaleway
access_key_id = ...
secret_access_key = ...
region = nl-ams
endpoint = s3.nl-ams.scw.cloud
acl = private
Sync the source to the destination one-way:
rclone sync scw_s3_fr:source-bucket scw_s3_nl:destination-bucket -P --metadata --checksum --check-first
Does anybody know what I did wrong here, or could you point me in the right direction to get the configuration right? My short-term needs are currently all met, but I wonder if a pure-boto3 solution is still possible.
Python 3.11.2 (main, Mar 7 2023, 16:53:12) [GCC 12.2.1 20230201] on linux, boto3 1.35.66
The issue you’re facing with Boto3 when trying to copy files between buckets in different regions on Scaleway arises because Boto3’s copy method isn’t fully compatible with non-AWS S3 implementations. Specifically, the SourceClient parameter doesn’t properly resolve the endpoint for the source bucket when working with Scaleway, which is why you end up with errors like NoSuchBucket.
To make Boto3 work for cross-region copies in this context, you have to get and put the objects yourself rather than using copy:
import boto3
from dotenv import dotenv_values

config = dotenv_values(".env")

src_s3 = boto3.client(
    service_name="s3",
    region_name="fr-par",
    endpoint_url="https://s3.fr-par.scw.cloud",
    aws_access_key_id=config.get("SCW_ACCESS_KEY"),
    aws_secret_access_key=config.get("SCW_SECRET_KEY"),
)
dest_s3 = boto3.client(
    service_name="s3",
    region_name="nl-ams",
    endpoint_url="https://s3.nl-ams.scw.cloud",
    aws_access_key_id=config.get("SCW_ACCESS_KEY"),
    aws_secret_access_key=config.get("SCW_SECRET_KEY"),
)

source_bucket = "source_bucket_name"
destination_bucket = "destination_bucket_name"
object_key = "source_object_name"

# Download the object from the source bucket
response = src_s3.get_object(Bucket=source_bucket, Key=object_key)
object_data = response["Body"].read()

# Upload it to the destination bucket
dest_s3.put_object(Bucket=destination_bucket, Key=object_key, Body=object_data)
The copy failed because the SourceClient parameter is designed for AWS-specific cross-region copying; it doesn’t work with non-AWS S3 providers like Scaleway due to strict endpoint URL resolution and its reliance on AWS-specific behaviors.