amazon-web-services, amazon-s3

How to check if a file uploaded to AWS S3 has the same content as a local file


We believe the S3 checksum cannot be compared directly against a plain local hash for large files (over 1 GB) uploaded in multiple parts, because it is not a hash of the whole file: it is a hash of the per-part hashes, where the part boundaries depend on the part size chosen for the upload.

A 1 GB file has been uploaded to AWS S3. Its SHA256 checksum value is "o9mK1Ay32kIpvW157S40b/2siazR/+tpuz6OYCsjNBU=-2620" (the suffix after the hyphen is the number of parts).

Is there any way to verify that the local file and the file uploaded to S3 are identical in content, without downloading it, of course?

I am hoping to use the AWS SDK or CLI to calculate the same checksum value from the local file.


Solution

  • I was able to get all the per-part hash values with the get-object-attributes command. The byte count of each part is explicitly listed, so I assume the listing is complete. (The response is paginated at 1000 parts; the remaining pages can be fetched by passing --part-number-marker with the NextPartNumberMarker value.)

    [cloudshell-user ~]$ aws s3api get-object-attributes --bucket xxx --key "xxx.bin"    --object-attributes "Checksum,ObjectParts"
    {
        "LastModified": "2024-07-10T20:56:50+00:00",
        "Checksum": {
            "ChecksumSHA256": "ZQHQKvGsIHCdKb9APtdkmY4RH/FwucuznEoShkrEPsw="
        },
        "ObjectParts": {
            "TotalPartsCount": 3106,
            "PartNumberMarker": 0,
            "NextPartNumberMarker": 1000,
            "MaxParts": 1000,
            "IsTruncated": true,
            "Parts": [
                {
                    "PartNumber": 1,
                    "Size": 5242880,
                    "ChecksumSHA256": "nkyKW7CVp1UjpQX66AHwRp3tMTJmpguNoyz+S5lcwt8="
                },
    ... (remaining output omitted)
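Given the part sizes above, the same composite value can be reproduced locally: SHA-256 each part, SHA-256 the concatenation of the raw part digests, base64-encode the result, and append the part count. A minimal Python sketch (the function name is my own; the part size must match the one used for the upload, 5,242,880 bytes in the output above):

```python
import base64
import hashlib

def s3_composite_sha256(path, part_size=5 * 1024 * 1024):
    """Reproduce S3's multipart ChecksumSHA256 ("checksum of checksums"):
    SHA-256 each part, then SHA-256 the concatenated raw part digests,
    base64-encode, and append '-<part count>'."""
    part_digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            part_digests.append(hashlib.sha256(chunk).digest())
    if len(part_digests) == 1:
        # Single-part uploads report a plain SHA-256 with no suffix.
        return base64.b64encode(part_digests[0]).decode()
    combined = hashlib.sha256(b"".join(part_digests)).digest()
    return f"{base64.b64encode(combined).decode()}-{len(part_digests)}"
```

If the value returned for the local file matches the ChecksumSHA256 reported by get-object-attributes, the contents are identical; a single mismatched byte in any part changes that part's digest and therefore the composite value.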