amazon-web-services amazon-s3 eventual-consistency

S3 last-modified timestamp for eventually-consistent overwrite PUTs


The AWS S3 docs state that:

Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all regions.

http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel

The time until full consistency is reached can vary. During this period, GET requests may return either the previous object or the updated object.

My question is:

When is the last-modified timestamp updated? Is it updated immediately after the overwrite PUT succeeds but before full consistency is reached, or is it only updated after full consistency is achieved?

I suspect the former but I can't find any documentation which clearly states this.


Solution

  • The Last-Modified timestamp should match the Date value returned in the response headers from the successful PUT request.

    To my knowledge, this is not explicitly documented, but it can be derived from what is documented.

    When you overwrite an object, it's not the overwriting itself that may be delayed by the eventual consistency model -- it's the availability of the overwritten content at a given S3 node (S3 is replicated to multiple nodes within the S3 region).

    But note that this answer was written in 2016, and in 2020, S3 announced that eventual consistency should no longer be a concern:

    Effective immediately, all S3 GET, PUT, and LIST operations, as well as operations that change object tags, ACLs, or metadata, are now strongly consistent. What you write is what you will read, and the results of a LIST will be an accurate reflection of what’s in the bucket. This applies to all existing and new S3 objects, works in all regions, and is available to you at no extra charge! There’s no impact on performance, you can update an object hundreds of times per second if you’d like, and there are no global dependencies.

    https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/

    The Last-Modified timestamp, like the rest of the metadata, is established at the time of object creation and is immutable thereafter.

    It is, in fact, not the "modification" time of the object at all; it is the creation time of the object. The explanation may sound pedantic, but it is accurate in the strictest sense: S3 objects and their metadata cannot be modified at all; they can only be overwritten. When you "overwrite" an object in S3, what you are actually doing is creating a new object that reuses the old object's key (path + file name).

    The official documentation uses very casual terminology here:

    The object creation date or the last modified date, whichever is the latest.

    https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingMetadata.html

    That's just not correct in a literal sense, because objects themselves cannot be modified -- even "editing" object metadata creates an entirely new copy of the object with the new metadata. The content associated with a specific object key can be "modified" -- by overwriting the object -- and that's what they're actually speaking of, here.

    In theory (writing now in 2023), replication delays are effectively a thing of the past, but then as now, Last-Modified would not be affected by them.

    The availability of this new object at a given S3 node (replication) is what may be delayed by the eventual consistency model... not the actual creation of the new object that overwrites the old one... hence there would be no reason for Last-Modified to be impacted by a replication delay (assuming there is a replication delay -- eventual consistency can at times be indistinguishable from immediate consistency).
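
To check the claim that Last-Modified should match the Date header of the successful PUT, the two header values can be parsed and compared. The sketch below is only an illustration: the function name `timestamps_match`, the one-second tolerance, and the sample header strings are assumptions rather than anything S3 documents. In practice the strings would come from the PUT response and from a later HEAD or GET on the same key.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def timestamps_match(put_date, last_modified, tolerance=timedelta(seconds=1)):
    """Return True if two HTTP-date header values are within `tolerance`.

    `tolerance` is an assumed allowance for rounding, not a documented S3 value.
    """
    put_time = parsedate_to_datetime(put_date)
    modified_time = parsedate_to_datetime(last_modified)
    return abs(put_time - modified_time) <= tolerance


# Illustrative header values, not captured from a real request.
put_date = "Wed, 12 Oct 2016 17:50:00 GMT"             # Date header of the PUT response
last_modified = "Wed, 12 Oct 2016 17:50:00 GMT"        # Last-Modified of the new object
stale_last_modified = "Tue, 11 Oct 2016 09:00:00 GMT"  # Last-Modified of the previous object

print(timestamps_match(put_date, last_modified))
print(timestamps_match(put_date, stale_last_modified))
```

If the reasoning above holds, a read that returned the previous object during an eventual-consistency window would also carry the previous object's Last-Modified, so a mismatch would point to a stale read rather than a delayed timestamp update.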