In short:
Let’s say s-maxage is one day and max-age is one hour. The proxy cache will keep a resource for a day, but after a few hours the Age header will be more than one hour. The browser sees that the resource is older than one hour and won’t cache it locally. How can I make the browser cache it locally regardless?
I'm trying to combine Cache-Control: max-age and s-maxage with sensible values.
But setting s-maxage > max-age doesn't seem to make sense.
Eventually the browser will always revalidate resources and will skip the local browser cache, because resources received from the proxy cache will immediately be stale (Age > max-age).
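Goals:
Keep resources in the proxy cache for a long time (s-maxage), this can be purged when necessary.
Keep resources in the browser cache for a short time (max-age).
Problem: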
A resource stays in the proxy cache for a long time (s-maxage), while its Age increases (time since it was fetched from origin).
Eventually the Age of the resource in the cache will be larger than max-age.
When that happens the browser will revalidate every time it needs the resource, since the resource is stale on every request.
For example: Cache-Control: max-age=60, s-maxage=86400.
The browser should keep a resource for 60 seconds. The proxy cache keeps a resource for a day.
t=0
browser: need resource
proxy cache: fetch resource from origin
-> cache returns fresh resource with Age: 0 / Cache-Control: "max-age=60, s-maxage=86400"
t=30
browser: locally cached resource is still fresh: Age (0+30) < max-age (60)
t=70
browser: local cache is stale: Age (0+70) > max-age (60) -> revalidate
proxy cache: cached resource is still fresh: Age (70) < s-maxage (86400)
-> cache returns resource with Age: 70, Cache-Control: "max-age=60, s-maxage=86400"
The cache returned a resource with an Age (70) that is larger than max-age (60).
From now on every time the browser wants the resource it will be stale locally and needs revalidation.
t=75
browser: local cache is stale: Age (70+5) > max-age (60) -> revalidate
proxy cache: cached resource is still fresh: Age (75) < s-maxage (86400)
-> cache returns resource with Age: 75 / Cache-Control: "max-age=60, s-maxage=86400"
This means that if a resource is in the proxy cache for longer than max-age, the browser will always revalidate.
The max-age value is only useful for the first max-age seconds after the proxy cache fetches a fresh resource from origin.
Setting s-maxage > max-age doesn't seem to make sense.
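The timeline above can be reproduced with a small simulation. This is only a sketch under simplifying assumptions: the names `Stored` and `simulate` are made up for illustration, a revalidation is modelled as the proxy simply handing back its copy with the copy's current Age, and the age calculation ignores network delay.

```python
from dataclasses import dataclass

MAX_AGE = 60       # browser freshness lifetime in seconds (max-age)
S_MAXAGE = 86400   # proxy freshness lifetime in seconds (s-maxage)


@dataclass
class Stored:
    age_on_arrival: int  # Age header value when this copy was stored
    stored_at: int       # clock time at which this copy was stored

    def current_age(self, now: int) -> int:
        # Simplified: Age header value plus time spent in this cache.
        return self.age_on_arrival + (now - self.stored_at)


def simulate(request_times):
    """Return a (time, what happened) pair for each browser request."""
    proxy = None
    browser = None
    log = []
    for now in request_times:
        if browser is not None and browser.current_age(now) < MAX_AGE:
            log.append((now, "local hit"))
            continue
        # Browser copy is missing or stale, so the request goes to the proxy.
        if proxy is None or proxy.current_age(now) >= S_MAXAGE:
            proxy = Stored(age_on_arrival=0, stored_at=now)  # fetch from origin
        # The proxy answers with its copy and that copy's current Age.
        browser = Stored(age_on_arrival=proxy.current_age(now), stored_at=now)
        log.append((now, f"proxy, Age={browser.age_on_arrival}"))
    return log


for entry in simulate([0, 30, 70, 75]):
    print(entry)
```

With these request times, only the request at t=30 is served from the browser's own cache. From t=70 onwards every response arrives with an Age already above 60, so each later request goes back to the proxy.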
Your analysis looks correct, and I think I agree. Specifically, given a particular value of s-maxage, it’s hard to see any reason why you’d want to use a smaller max-age, since that will result in pointless conditional validation requests.
Note that there are reasonable use cases for setting max-age > s-maxage, so it still makes sense for the protocol to define these as two separate directives.
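To see why the opposite ordering behaves differently, here is a rough calculation of how long a browser can reuse a response it receives from the shared cache. The function name is hypothetical, and the sketch assumes the shared cache only serves copies it still considers fresh, so the Age it sends lies between 0 and s-maxage.

```python
from typing import Tuple


def browser_freshness_window(max_age: int, s_maxage: int) -> Tuple[int, int]:
    """Return (worst, best) seconds a browser may reuse a proxy response.

    Assumes the proxy sends an Age between 0 and s_maxage.
    """
    best = max_age                      # proxy copy just fetched, Age: 0
    worst = max(0, max_age - s_maxage)  # proxy copy about to expire
    return worst, best


print(browser_freshness_window(max_age=3600, s_maxage=60))   # (3540, 3600)
print(browser_freshness_window(max_age=60, s_maxage=86400))  # (0, 60)
```

When max-age is larger than s-maxage, the browser always gets most of its freshness lifetime. When it is smaller, the worst case is zero, which is the situation described in the question.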
How can I force the browser to cache a resource for a specific amount of time after receiving it, regardless of its freshness (as indicated by the Age header)?
HTTP caching is based on the age of the resource, not when a response happened to be received. So there’s no way to force the user’s browser or the proxy cache to do this.
But my cache plan is perfectly reasonable: I want to set max-age to a small value that accurately represents the degree of staleness I can tolerate, but set s-maxage to a long value and simply invalidate the proxy cache when the resource changes. Why won’t the HTTP specification support that?
The fundamental issue here is that your—indeed, perfectly reasonable—cache scheme relies on having control over the proxy cache (to force invalidation), whereas the internet architecture defined by HTTP is based on independent actors that the origin server doesn't have control over. What you’re describing as a proxy cache is better thought of as a managed cache. MDN has a useful discussion of this:
Shared caches can be further sub-classified into proxy caches and managed caches.... Managed caches are explicitly deployed by service developers to offload the origin server and to deliver content efficiently....
In most cases, you can control the [managed] cache's behavior through the Cache-Control header and your own configuration files or dashboards. For example, the HTTP Caching specification essentially does not define a way to explicitly delete a cache—but with a managed cache, the stored response can be deleted at any time through dashboard operations, API calls, restarts, and so on. That allows for a more proactive caching strategy.
So if you’re using a managed cache (CloudFront, etc.), you don’t need to use s-maxage at all to manage the cache. You can control its settings directly, outside of the HTTP protocol.
Even if I use settings to control the managed cache retention period, I still face the issue of the Age header forcing the browser to revalidate.
In the same way that the managed cache exposes TTL settings, it could also expose the ability to send a fresh Age header on each response. Whether any particular managed cache solution does that, I don’t know.
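As a rough illustration of what "sending a fresh Age header" would mean, the sketch below rewrites the headers of an outgoing response. Both function names are hypothetical and this is not the API of any specific CDN. It only shows the effect such a hook would have on the browser's freshness calculation, assuming the managed cache lets you modify response headers at all.

```python
def reset_age(headers: dict) -> dict:
    """Return a copy of the response headers with Age reset to 0.

    Hypothetical edge hook, not the API of any real managed cache.
    """
    out = {k: v for k, v in headers.items() if k.lower() != "age"}
    out["Age"] = "0"
    return out


def remaining_browser_freshness(headers: dict, max_age: int) -> int:
    """Seconds the browser may still reuse the response without revalidating."""
    age = int(headers.get("Age", "0"))
    return max(0, max_age - age)


cached = {"Cache-Control": "max-age=60, s-maxage=86400", "Age": "75"}
print(remaining_browser_freshness(cached, 60))             # 0
print(remaining_browser_freshness(reset_age(cached), 60))  # 60
```

The trade-off is that the browser no longer knows how old the content really is, so this only seems safe when the managed cache is reliably purged whenever the resource changes.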