python api rest post get

REST - GET vs POST


I want to design an API call following REST principles.

Let's assume I want to get information about users, where users are identified by IDs.

With a GET method, the call would look like: /users?id=XXXXX,YYYYY,...

The problem with this is: what happens when the list of users sent is too big? The URI size limit is reached.

With a POST method, the call would look like: /users

and the request body would look like:

{
    "users": [XXXXX, YYYYY]
}

As far as I know, the GET method should only be used to read data, and the POST method to create new resources.

How should I design this properly?


Solution

  • As far as I know, the GET method should only be used to read data, and the POST method to create new resources.

    This isn't right. The currently registered reference for both GET semantics and POST semantics is RFC 9110. In summary:

    POST only becomes an issue when it is used in a situation for which some other method is ideally suited -- Fielding, 2009

    The underlying tension in your case is this: GET stops being ideally suited when your resource identifiers start running into URI length limits (RFC 9110 recommends that senders and recipients support URIs of at least 8000 octets).

    So at that point, we need to give up the benefits we realize from using GET, and fall back to POST with the identifying information encoded in the request body.
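As a rough illustration of that trade-off, here is a minimal client-side sketch that prefers GET and only falls back to POST when the URI would exceed a length budget. The function name `build_user_lookup`, the host `api.example.com`, and the comma-separated `id` parameter are assumptions made for the example; `/users/byPost` is just one possible spelling of the POST target, and the 8000-octet budget is only the minimum that RFC 9110 recommends supporting, not a hard limit of any particular server.

```python
from urllib.parse import urlencode

# Assumption: budget taken from the minimum URI length RFC 9110 recommends supporting.
MAX_URI_OCTETS = 8000


def build_user_lookup(base_url, user_ids, max_octets=MAX_URI_OCTETS):
    """Return (method, url, json_body) for a lookup of the given user ids.

    Hypothetical helper: uses GET while the URI fits in the budget,
    otherwise POST with the ids carried in the request body.
    """
    ids = list(user_ids)
    # urlencode percent-encodes the commas, so the real encoded length is measured.
    query = urlencode({"id": ",".join(str(i) for i in ids)})
    get_url = f"{base_url}/users?{query}"
    if len(get_url.encode("utf-8")) <= max_octets:
        return "GET", get_url, None
    # Too long for a URI: move the identifying information into the body.
    return "POST", f"{base_url}/users/byPost", {"users": ids}
```

A short list stays on GET (and keeps its caching benefits); a list of several thousand ids comes back as a POST with the body shape shown in the question.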


    But the question is about the design. How would you create an API that can read all users, read a list of users, and also create new users, all while respecting REST principles?

    Think about how you would do it with web pages.

    1. If we had a large number of users, and expected that clients would want to be able to iterate through the list, then we'd probably introduce a bunch of page resources, with first/previous/next/last link relations to allow you to navigate through the pages.

    2. If we wanted to support queries for arbitrary collections of entries, we'd probably have a form to collect the information from the user, and then combine that information with a URI template to carry that information to the server.

    3. To support an arbitrarily long list of users (too clumsy to use in a URI) we'd use a similar form, but with a POST method instead of a GET method; so the target resource for the request would be identified by the form action, the client's information contained in the payload.

    3a) Because of HTTP's cache invalidation rules, we probably don't want the form action to be the identifier of a cacheable page: the search is effectively read-only, and it shouldn't invalidate previously cached responses. So the target is likely to be a URI specific to this particular interaction.

    3b) If the request really does have a large data set, then you might want to introduce paging in the response - in effect "creating" new resources on the server that are the result set for this specific query, which would themselves be linked together for easy paging.

    3c) If the representations are going to be large (because you don't want paging; or because the pages are really big) then range requests might be useful.

    4. "Creating" new resources on the server is, again, a form submission. The most reasonable candidate URI is that of the resource that will be changed when the new resource is created - the "first" page in the list if your sort order puts the most recently added at the front, for example. (Cache invalidation in HTTP isn't infinitely flexible; if many resources are changed by a form submission, then some of the client's pages will be out of date and that's just too bad.)
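The page resources from item 1 (and the result-set pages from 3b) could be linked along these lines. This is only a sketch: the helper `page_links`, the `page` and `size` query parameters, and the 1-based numbering are assumptions chosen for the example, not something the answer prescribes.

```python
import math
from urllib.parse import urlencode


def page_links(base_path, page, page_size, total_items):
    """Build first/previous/next/last links for one page of a collection.

    Hypothetical helper: pages are numbered from 1, and 'page'/'size'
    are illustrative parameter names.
    """
    last = max(1, math.ceil(total_items / page_size))
    page = min(max(page, 1), last)  # clamp out-of-range page numbers

    def href(n):
        return f"{base_path}?{urlencode({'page': n, 'size': page_size})}"

    links = {"first": href(1), "last": href(last)}
    if page > 1:
        links["previous"] = href(page - 1)
    if page < last:
        links["next"] = href(page + 1)
    return links
```

For example, `page_links("/users/byPage", 2, 50, 120)` describes the middle page of three, so it carries all four relations, while the first page omits `previous` and the last page omits `next`.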

    You'll want resource identifiers that are consistent with RFC 3986 and RFC 9110; you'll make things a lot easier on everyone (including yourself) if you restrict yourself to spellings that are described by URI Templates (see RFC 6570).

    Encoding information in the path vs encoding information in the query is a matter of trade-offs: using path segments is convenient for relative resolution; using the query part is convenient for HTML forms.
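To make the trade-off concrete, the sketch below uses Python's standard `urllib.parse` to show both sides: relative references resolving against a path-based identifier, and form fields being serialized into a query. The host, the `orders` sub-resource, and the field names are made up for illustration.

```python
from urllib.parse import urlencode, urljoin


def resolve(base_uri, relative_ref):
    """Resolve a relative reference against a base URI (RFC 3986 style)."""
    return urljoin(base_uri, relative_ref)


def form_uri(action, fields):
    """Build the URI an HTML GET form would request for the given fields."""
    return f"{action}?{urlencode(fields)}"


# Path segments: siblings and children are reachable with short relative references.
child = resolve("https://api.example.com/users/42/", "orders")
sibling = resolve("https://api.example.com/users/42/", "../43/")

# Query part: name/value pairs map directly onto form fields.
search = form_uri("https://api.example.com/users/byQuery", {"name": "smith", "page": 2})
```

Here `child` and `sibling` stay under `/users/`, which is what makes path segments handy for hypermedia links, while `search` is exactly the kind of URI a browser produces from a form without any client-side code.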

    Beyond that, you've got a lot of freedom, much like you do for choosing variable names in a programming language -- the machines don't care, so use the extra degrees of freedom to make things easier for the humans you care about.

    /users/byPage
    /users/byQuery
    /users/byPost
    

    is fine; operators reviewing your access logs will be able to interpret the identifiers fairly easily.