I'm writing a Python script to upload a large file (5 GB+) to an S3 bucket using a presigned URL. I have a JavaScript version of this code working, so I believe the logic and endpoints are all valid.
For each part of the file, I get a presigned multipart upload URL, then I attempt a PUT request to that URL:
offset = 0
part_number = 0
with open(file_path, 'rb') as f:
    while offset < file_size_bytes:
        # Get a presigned URL for this chunk
        get_multipart_upload_url_params = {
            "partNumber": part_number,
            "uploadId": upload_id,
            "Key": file_key,
        }
        get_multipart_upload_url_response = requests.get(GET_MULTIPART_UPLOAD_URL_ENDPOINT, params=get_multipart_upload_url_params)
        if 'uploadURL' not in get_multipart_upload_url_response.json():
            print("Error: Upload Part URL not found in response")
            sys.exit(1)
        chunk_upload_url = get_multipart_upload_url_response.json()['uploadURL']
        # Upload the chunk
        remaining_bytes = file_size_bytes - offset
        chunk_size = min(MAX_CHUNK_SIZE, remaining_bytes)
        chunk = f.read(chunk_size)
        if not chunk:
            break
        response = requests.put(chunk_upload_url, data=chunk)
...
When requests.put executes, I see an error that looks like:
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='bucket-name.s3.amazonaws.com', port=443): Max retries exceeded with url: [PRESIGNED URL REDACTED] (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:2426)')))
What's extra confusing is that a single-part upload function works fine against the same interface:
# Get presigned upload URL
upload_response = requests.get(SINGLE_PART_UPLOAD_API_ENDPOINT, params={
    'filename': filename,
}).json()
if 'uploadURL' not in upload_response or 'Key' not in upload_response:
    print("Error: Upload URL or file key not found in response")
    sys.exit(1)
upload_url = upload_response['uploadURL']
file_key = upload_response['Key']
# Upload the file using requests
print(f"Uploading: {file_path}")
with open(file_path, 'rb') as f:
    response = requests.put(upload_url, data=f, headers={"Content-Type": "application/octet-stream"})
...
Some of the things I've tried:
Update: the problem was that partNumber is 1-indexed (S3 accepts part numbers from 1 to 10,000) and I was initializing part_number to 0. I will leave this post up in the hope that it helps someone else in the future.
Reference: https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html
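For anyone hitting the same error, here is a minimal sketch of the chunking loop with part numbers starting at 1. The helper names iter_parts and upload_parts, and the get_part_url / put_part callables, are hypothetical and not part of my original script or of any library; they stand in for the presigned-URL request and the PUT so the numbering logic can be checked without a network call.

```python
import io
from typing import BinaryIO, Callable, Dict, Iterator, List, Tuple


def iter_parts(file_size_bytes: int, max_chunk_size: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (part_number, offset, length) for each chunk.

    part_number starts at 1 because S3 UploadPart part numbers are 1-indexed.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    offset = 0
    part_number = 1  # not 0: this was the bug in the question
    while offset < file_size_bytes:
        length = min(max_chunk_size, file_size_bytes - offset)
        yield part_number, offset, length
        offset += length
        part_number += 1


def upload_parts(
    f: BinaryIO,
    file_size_bytes: int,
    max_chunk_size: int,
    get_part_url: Callable[[int], str],    # hypothetical: returns a presigned URL for a part number
    put_part: Callable[[str, bytes], str],  # hypothetical: uploads the bytes, returns the part's ETag
) -> List[Dict[str, object]]:
    """Upload each chunk and collect the part number / ETag pairs."""
    parts = []
    for part_number, offset, length in iter_parts(file_size_bytes, max_chunk_size):
        f.seek(offset)
        chunk = f.read(length)
        url = get_part_url(part_number)
        etag = put_part(url, chunk)
        parts.append({"PartNumber": part_number, "ETag": etag})
    return parts


# Dry run with an in-memory file and stand-in callables (no network access).
demo_file = io.BytesIO(b"x" * 25)
demo_parts = upload_parts(
    demo_file,
    file_size_bytes=25,
    max_chunk_size=10,
    get_part_url=lambda n: f"https://example.invalid/part/{n}",
    put_part=lambda url, data: f"etag-for-{len(data)}-bytes",
)
print(demo_parts)
```

In the real script, get_part_url would wrap the requests.get call to GET_MULTIPART_UPLOAD_URL_ENDPOINT and put_part would wrap requests.put, reading the ETag from the response headers; I'm assuming here that the backend completes the upload from a list of part numbers and ETags.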