
Is using the If-Modified-Since header to filter a resource collection to only recent ones in a REST API considered a valid approach?


I'm designing a REST API where I have a need to provide the option to GET only the resources in a collection that were created or modified recently, based on a client-provided timestamp (which, in turn, will have been generated by the API in a previous response). I'm considering the use of the Last-Modified and If-Modified-Since headers for this purpose.

Earlier questions here (like "Is it valid to modify a REST API representation based on a If-Modified-Since header?") seem to indicate that this is frowned upon, on the grounds that RFC 2616 indicates that the purpose of these headers is related to caching. However, RFC 2616 has since been superseded by RFC 7232, which states:

If-Modified-Since is typically used for two distinct purposes: 1) to allow efficient updates of a cached representation that does not have an entity-tag and 2) to limit the scope of a web traversal to resources that have recently changed.

My interpretation is that my use case of allowing retrieval of all changes to the collection since the last retrieval is covered by the second purpose.

So I have two questions:

  1. Is this interpretation correct, or am I missing something subtle here?
  2. Even if my interpretation is correct, does that make it a good practice to use these headers in this way? In other words: what other reasons would there be to not use these headers after all and instead, for example, include a timestamp in the response and allow the client to provide that back in the query string for the next request?

Solution

  • Is this interpretation correct, or am I missing something subtle here?

    I believe RFC 7234 contradicts your interpretation.

    If an If-None-Match header field is not present, a request containing an If-Modified-Since header field (Section 3.3 of [RFC7232]) indicates that the client wants to validate one or more of its own stored responses by modification date. A cache recipient SHOULD generate a 304 (Not Modified) response (using the metadata of the selected stored response) if one of the following cases is true....

    The broad problem here is that a general-purpose cache has no way of knowing that your resource or your server gives the standard headers a different meaning. It will apply the standard semantics, and therefore clients are not going to have the experience you want.
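
    To make that concrete, here is a minimal sketch of how an intermediary cache might evaluate If-Modified-Since against a stored response, loosely following the RFC 7234 text quoted above. The names StoredResponse and cache_answer and the sample body are hypothetical, and the sketch ignores freshness, If-None-Match and everything else a real cache has to handle.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


@dataclass
class StoredResponse:
    """Hypothetical cache entry for GET /items (the whole collection)."""
    body: str
    last_modified: datetime


def cache_answer(stored: StoredResponse, if_modified_since: str):
    """Simplified cache logic: validate the stored response by date.

    The cache knows nothing about per-item filtering. It only compares
    the stored Last-Modified with the date the client sent.
    """
    since = parsedate_to_datetime(if_modified_since)
    if stored.last_modified <= since:
        return 304, ""           # "you already have the current version"
    return 200, stored.body      # the full stored body, not a filtered subset


stored = StoredResponse(
    body='["a", "b", "c"]',
    last_modified=datetime(2022, 3, 1, tzinfo=timezone.utc),
)

# Client timestamps rendered as HTTP dates (e.g. "... GMT").
later = format_datetime(datetime(2022, 3, 2, tzinfo=timezone.utc), usegmt=True)
earlier = format_datetime(datetime(2022, 2, 1, tzinfo=timezone.utc), usegmt=True)

hit = cache_answer(stored, later)     # (304, '')
miss = cache_answer(stored, earlier)  # (200, '["a", "b", "c"]')
```

    In both cases the cache answers on its own terms: either "nothing changed" or the entire stored representation. The "only items changed since T" meaning never reaches the client.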

    Furthermore...

    I'm designing a REST API where I have a need to provide the option to GET only the resources in a collection that were created or modified recently, based on a client-provided timestamp (which, in turn, will have been generated by the API in a previous response).

    We already have a standardized mechanism for this - it's the URI. That may become clearer if you review Fielding's definition of resource.

    I understand it this way: "resource", within the context of REST, is a generalization of "document" (see also Jim Webber, 2011). It's perfectly reasonable to have many different documents derived from the same (or overlapping) information.

    Think "paging" - every page is a different document, with its own unique identifier, but all of the pages are being derived from the same common source, with items moving from one page to another over time.
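
    As an illustration of the URI approach, here is a minimal sketch in which "the items changed since T" is its own resource, identified by a query parameter. The parameter name modified_since, the host api.example.com, the helper names and the sample data are all assumptions made for the example, not part of any standard.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical collection: item id -> last modification time.
ITEMS = {
    "a": datetime(2022, 1, 1, tzinfo=timezone.utc),
    "b": datetime(2022, 2, 15, tzinfo=timezone.utc),
    "c": datetime(2022, 3, 5, tzinfo=timezone.utc),
}


def changes_uri(base: str, since: datetime) -> str:
    """Build the identifier of the 'changed since <timestamp>' resource."""
    return f"{base}?{urlencode({'modified_since': since.isoformat()})}"


def handle_get(uri: str) -> list:
    """Toy handler: return the ids selected by the URI's query string."""
    query = parse_qs(urlsplit(uri).query)
    if "modified_since" not in query:
        return sorted(ITEMS)  # no filter: the whole collection
    since = datetime.fromisoformat(query["modified_since"][0])
    return sorted(item for item, ts in ITEMS.items() if ts > since)


uri = changes_uri(
    "https://api.example.com/items",
    datetime(2022, 3, 1, tzinfo=timezone.utc),
)
recent = handle_get(uri)                                   # ['c']
everything = handle_get("https://api.example.com/items")   # ['a', 'b', 'c']
```

    Because each timestamp yields a distinct URI, a cache treats each filtered view as a separate resource, and the standard validators remain free to do their usual job on each of them.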