
How to set Content-Type header with boto3 presigned url multipart upload


Is there any way to allow a Content-Type header on multipart uploads to a presigned S3 URL?

Let's begin with the following code:

import boto3
import requests


BUCKET_NAME = "foo"

# No, it's global in this MRE only
client = boto3.client('s3')


def create(key):
    response = client.create_multipart_upload(Bucket=BUCKET_NAME, Key=key)
    return response['UploadId']


def get_url(key, upload_id, chunk_number):
    signed_url = client.generate_presigned_url(
        ClientMethod='upload_part',
        Params={
            "Bucket": BUCKET_NAME,
            "Key": key,
            "PartNumber": chunk_number,
            "UploadId": upload_id,
            # "ContentType": "application/x-www-form-urlencoded",
        },
        ExpiresIn=60 * 60,  # seconds
    )
    return signed_url


def complete(key, upload_id, parts):
    client.complete_multipart_upload(
        Bucket=BUCKET_NAME,
        Key=key,
        UploadId=upload_id,
        MultipartUpload={"Parts": parts}
    )


def test_upload():
    key = 'data/foo.bar'
    upload_id = create(key)

    url = get_url(key, upload_id, 1)
    with open("/tmp/foo.bar", "rb") as src:
        response = requests.put(url, data=src.read())

    etag = response.headers["ETag"]
    # "Parts" must be a list of part descriptors, even for a single part
    complete(key, upload_id, [{"ETag": etag, "PartNumber": 1}])

Hooray, this works. However, let's try to do the same from the frontend, replacing the requests call with

fetch(uploadTo, {
    method: 'PUT',
    body: blob,
})

(no matter how blob is defined here, this is irrelevant to our problem).

And this fails with 403 SignatureDoesNotMatch. Why? Because a Content-Type header is set (and fetch cannot do without it), and this header is part of the signature that S3 verifies on its side. The content type is not part of the generated URL, so whatever content type fetch sends, the signatures will not match. I know this is the cause because of what the response looks like (only the URL differs; ignore the inconsistencies, uploadId=1 is just a fake, and the same thing happens with a real URL; pay attention to the StringToSign tag):

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>SignatureDoesNotMatch</Code>
  <Message>The request signature we calculated does not match the signature you
    provided. Check your key and signing method.</Message>
  <AWSAccessKeyId>...</AWSAccessKeyId>
  <StringToSign>PUT
    application/x-www-form-urlencoded
    1684064749
    x-amz-security-token:FwoGZXIvYXdzEOT//////////wEaDE+2gqVw4NTt1c2eOCKGAVTf4uDCD+GJ8P6lG2vBg8yQ2dqyU7/6aHg4hXMljyDFByT7hJ1/F/GPwBi84eAMDZqGzXpIySe8PhU80ak5C4vg7vcGOOSaB3cXk7TtQ2q0pWb8MB0AYb3LGAJ6sahySjHSdArFFADB60u6SskWhq9HHSijilW9hKIiUgdceZAPhLH1J59oKITngqMGMijigFpTERPtZLB+MOjIqJIpHvPJrrfRg4mzwAmZbk+rropyYha4rBNP
    /sim-cal-bucket/temp/foo.mp4?partNumber=1&amp;uploadId=1</StringToSign>
  <SignatureProvided>miwrnxtxoPdGnuAEqiP52ZMscBQ=</SignatureProvided>
  <StringToSignBytes>50 55 54 0a 0a 61 70 70 6c 69 63 61 74 69 6f 6e 2f 78 2d
    77 77 77 2d 66 6f 72 6d 2d 75 72 6c 65 6e 63 6f 64 65 64 0a 31 36 38 34 30
    36 34 37 34 39 0a 78 2d 61 6d 7a 2d 73 65 63 75 72 69 74 79 2d 74 6f 6b 65
    6e 3a 46 77 6f 47 5a 58 49 76 59 58 64 7a 45 4f 54 2f 2f 2f 2f 2f 2f 2f 2f
    2f 2f 77 45 61 44 45 2b 32 67 71 56 77 34 4e 54 74 31 63 32 65 4f 43 4b 47
    41 56 54 66 34 75 44 43 44 2b 47 4a 38 50 36 6c 47 32 76 42 67 38 79 51 32
    64 71 79 55 37 2f 36 61 48 67 34 68 58 4d 6c 6a 79 44 46 42 79 54 37 68 4a
    31 2f 46 2f 47 50 77 42 69 38 34 65 41 4d 44 5a 71 47 7a 58 70 49 79 53 65
    38 50 68 55 38 30 61 6b 35 43 34 76 67 37 76 63 47 4f 4f 53 61 42 33 63 58
    6b 37 54 74 51 32 71 30 70 57 62 38 4d 42 30 41 59 62 33 4c 47 41 4a 36 73
    61 68 79 53 6a 48 53 64 41 72 46 46 41 44 42 36 30 75 36 53 73 6b 57 68 71
    39 48 48 53 69 6a 69 6c 57 39 68 4b 49 69 55 67 64 63 65 5a 41 50 68 4c 48
    31 4a 35 39 6f 4b 49 54 6e 67 71 4d 47 4d 69 6a 69 67 46 70 54 45 52 50 74
    5a 4c 42 2b 4d 4f 6a 49 71 4a 49 70 48 76 50 4a 72 72 66 52 67 34 6d 7a 77
    41 6d 5a 62 6b 2b 72 72 6f 70 79 59 68 61 34 72 42 4e 50 0a 2f 73 69 6d 2d
    63 61 6c 2d 62 75 63 6b 65 74 2f 74 65 6d 70 2f 66 6f 6f 2e 6d 70 34 3f 70
    61 72 74 4e 75 6d 62 65 72 3d 31 26 75 70 6c 6f 61 64 49 64 3d 31</StringToSignBytes>
  <RequestId>073V6QJXMA0XAKWS</RequestId>
  <HostId>1pR1Pz4RSnRilgjUbb0AVDcMWiqCq05dMrAVU+0t4a0HF5ytfXmNiIecxH80urVoiKtxtHhxS2o=</HostId>
</Error>
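
The StringToSign above has the layout of Signature Version 2 query authentication: method, Content-MD5, Content-Type, expiry, canonical x-amz-* headers, resource. A minimal sketch of that computation, under the assumption that this is what the signer does here, shows why any Content-Type the client adds changes the expected signature. The helper names, the "TOKEN" value and the secret key are placeholders, not real values.

```python
import base64
import hashlib
import hmac


def string_to_sign(method, content_type, expires, amz_headers, resource, content_md5=""):
    """Build a Signature Version 2 (query auth) string to sign."""
    # x-amz-* headers are lowercased, sorted, and written one per line.
    canonical = sorted((name.lower(), value) for name, value in amz_headers.items())
    header_block = "".join(f"{name}:{value}\n" for name, value in canonical)
    return f"{method}\n{content_md5}\n{content_type}\n{expires}\n{header_block}{resource}"


def sign(secret_key, text):
    """Base64-encoded HMAC-SHA1 of the string to sign."""
    digest = hmac.new(secret_key.encode("utf-8"), text.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    resource = "/sim-cal-bucket/temp/foo.mp4?partNumber=1&uploadId=1"
    headers = {"x-amz-security-token": "TOKEN"}  # placeholder token
    secret = "not-a-real-secret"  # placeholder secret key

    # What the URL was signed for: an empty Content-Type line.
    signed_for = string_to_sign("PUT", "", 1684064749, headers, resource)
    # What S3 reconstructs once the client sends a Content-Type header.
    received = string_to_sign(
        "PUT", "application/x-www-form-urlencoded", 1684064749, headers, resource
    )
    print(sign(secret, signed_for) == sign(secret, received))
```

The two strings differ only in the third line, which is enough to produce a different HMAC and therefore a 403.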

So we need to pass a Content-Type to the signed URL somehow. Neither generate_presigned_url nor its Params (which must match the parameters of upload_part) accepts a ContentType option. This looks like a dead end... To double-check, here's what works in JS: ContentType is passed to the signer.

For now I'm just monkeypatching botocore to allow passing a ContentType parameter to upload_part (I cloned botocore/data/s3/2006-03-01/service-2.json, added this parameter to the UploadPartRequest definition, and patch the file in the venv from the Dockerfile), but that is certainly not what I want. However, it confirms that I really need to pass ContentType, and that no other solution allows setting this header. After uncommenting the ContentType key in the sample above, everything works.

For comparison, below are the URLs generated without and with a content type; the value is included directly in the URL as a content-type query parameter. The latter URL works with frontend fetch flawlessly.

https://sim-cal-bucket.s3.amazonaws.com/temp/foo.mp4?partNumber=1&uploadId=1&AWSAccessKeyId=...&Signature=miwrnxtxoPdGnuAEqiP52ZMscBQ%3D&x-amz-security-token=FwoGZXIvYXdzEOT%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDE%2B2gqVw4NTt1c2eOCKGAVTf4uDCD%2BGJ8P6lG2vBg8yQ2dqyU7%2F6aHg4hXMljyDFByT7hJ1%2FF%2FGPwBi84eAMDZqGzXpIySe8PhU80ak5C4vg7vcGOOSaB3cXk7TtQ2q0pWb8MB0AYb3LGAJ6sahySjHSdArFFADB60u6SskWhq9HHSijilW9hKIiUgdceZAPhLH1J59oKITngqMGMijigFpTERPtZLB%2BMOjIqJIpHvPJrrfRg4mzwAmZbk%2BrropyYha4rBNP&Expires=1684064749
https://sim-cal-bucket.s3.amazonaws.com/temp/foo.mp4?partNumber=1&uploadId=1&AWSAccessKeyId=...&Signature=1FeHhXi7QRtL0wCT7kJ%2BVEcBeso%3D&content-type=application%2Fx-www-form-urlencoded&x-amz-security-token=FwoGZXIvYXdzEOT%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDE%2B2gqVw4NTt1c2eOCKGAVTf4uDCD%2BGJ8P6lG2vBg8yQ2dqyU7%2F6aHg4hXMljyDFByT7hJ1%2FF%2FGPwBi84eAMDZqGzXpIySe8PhU80ak5C4vg7vcGOOSaB3cXk7TtQ2q0pWb8MB0AYb3LGAJ6sahySjHSdArFFADB60u6SskWhq9HHSijilW9hKIiUgdceZAPhLH1J59oKITngqMGMijigFpTERPtZLB%2BMOjIqJIpHvPJrrfRg4mzwAmZbk%2BrropyYha4rBNP&Expires=1684065651
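
A small sketch for spotting the difference between two such URLs programmatically, using only the standard library. The function name is made up for illustration, and the URLs in the usage lines are shortened placeholders rather than the real ones above.

```python
from urllib.parse import parse_qs, urlsplit


def extra_signed_params(plain_url, typed_url):
    """Return query parameters present in typed_url but absent from plain_url."""
    plain = parse_qs(urlsplit(plain_url).query)
    typed = parse_qs(urlsplit(typed_url).query)
    # parse_qs percent-decodes values and wraps each one in a list.
    return {key: values[0] for key, values in typed.items() if key not in plain}


if __name__ == "__main__":
    base = "https://example-bucket.s3.amazonaws.com/temp/foo.mp4?partNumber=1&uploadId=1"
    without_type = base + "&AWSAccessKeyId=AKIDEXAMPLE&Signature=abc%3D&Expires=1684064749"
    with_type = (
        base
        + "&AWSAccessKeyId=AKIDEXAMPLE&Signature=def%3D"
        + "&content-type=application%2Fx-www-form-urlencoded&Expires=1684065651"
    )
    print(extra_signed_params(without_type, with_type))
```

The decoded content-type value is the one the client has to send for the signature to match.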

Solutions that suggest making the request without Content-Type are unacceptable: this is part of a public API, and I do not want to make customers jump through hoops to send such a request.


Solution

  • Since the issue (original, following) is apparently not going to be resolved any time soon, here's what we are doing now. The first snippet works for older botocore (1.28.1 certainly), and the second works for newer versions, where this definition file is gzipped. I'm not posting the entire Dockerfile since it contains too many irrelevant details, but this extract should be sufficient (it is in fact part of a longer chain of RUN commands).

    Make sure to adjust the venv location to your setup. jq must be installed beforehand (e.g. via apt-get update && apt-get install -y jq && rm -rf /var/lib/apt/lists/* for Debian/Ubuntu images).

    # botocore 1.28.1
    # Monkey-patch botocore to allow passing ContentType to presigned S3 upload URL
    ARG BOTOCORE_PATCH_FILE=".venv/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/service-2.json"
    RUN jq '.shapes.UploadPartRequest.members |= . + { \
            "ContentType": { \
              "shape": "ContentType", \
              "documentation": "<p>Upload Content-Type.</p>", \
              "location": "header", \
              "locationName": "Content-Type" \
            } \
        }' $BOTOCORE_PATCH_FILE > patch.json \
        && mv patch.json $BOTOCORE_PATCH_FILE
    

    The second snippet uses process substitution (<(...)), so make sure the shell is bash.

    # botocore 1.34.10
    SHELL ["/bin/bash", "-e", "-u", "-x", "-o", "pipefail", "-c"]
    # ...
    # Monkey-patch botocore to allow passing ContentType to presigned S3 upload URL
    ARG BOTOCORE_PATCH_FILE=".venv/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/service-2.json.gz"
    RUN jq '.shapes.UploadPartRequest.members |= . + { \
            "ContentType": { \
              "shape": "ContentType", \
              "documentation": "<p>Upload Content-Type.</p>", \
              "location": "header", \
              "locationName": "Content-Type" \
            } \
        }' <(gunzip -c $BOTOCORE_PATCH_FILE) > patch.json \
        && gzip -c patch.json > $BOTOCORE_PATCH_FILE \
        && rm patch.json
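
If jq is not available in the image, the same edit can probably be done with a few lines of Python. This is a minimal sketch, assuming the file layout described above (plain service-2.json for older botocore, service-2.json.gz for newer). patch_service_model and the script name in the usage line are hypothetical names, not part of botocore.

```python
import gzip
import json
import sys
from pathlib import Path

# Same member definition the jq snippets add to UploadPartRequest.
CONTENT_TYPE_MEMBER = {
    "shape": "ContentType",
    "documentation": "<p>Upload Content-Type.</p>",
    "location": "header",
    "locationName": "Content-Type",
}


def patch_service_model(path):
    """Add ContentType to UploadPartRequest; return True if the file changed."""
    path = Path(path)
    # Newer botocore ships the model gzipped, older versions as plain JSON.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as src:
        model = json.load(src)

    members = model["shapes"]["UploadPartRequest"]["members"]
    if "ContentType" in members:
        return False  # already patched, leave the file untouched

    members["ContentType"] = CONTENT_TYPE_MEMBER
    with opener(path, "wt", encoding="utf-8") as dst:
        json.dump(model, dst)
    return True


if __name__ == "__main__" and len(sys.argv) > 1:
    changed = patch_service_model(sys.argv[1])
    print("patched" if changed else "already patched")
```

It could be called from the Dockerfile as, for example, RUN python patch_botocore.py "$BOTOCORE_PATCH_FILE". Unlike the jq version, it skips the write when the member is already present, so running it twice is harmless.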