
Why is the HTTP Status-Line different from the Request-Line?


Both the HTTP Request-Line and the Status-Line have three components:

Request-Line= Method       SP Request-URI SP HTTP-Version  CRLF
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

The Status-Line (the server response) is fine.

What I'm failing to understand is why the Request-Line is so different:

Why does it not follow the same (clean) pattern as the Status-Line?

Request-Line= HTTP-Version SP Method     SP Request-URI CRLF

This way the Request-URI could contain any TEXT character (except CR/LF), including spaces, because it would simply run to the end of the line.

So it would look like this:

HTTP/1.1 GET /user/with space
...

HTTP/1.1 404 NOT FOUND
...
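
To illustrate the idea, here is a minimal Python sketch of how both lines could be parsed the same way if the version came first. The function names are hypothetical, and the request layout is the one proposed above, not the real HTTP grammar:

```python
from typing import Tuple


def parse_proposed_request_line(line: str) -> Tuple[str, str, str]:
    """Parse the hypothetical 'HTTP-Version SP Method SP Request-URI' layout."""
    # Split on the first two spaces only; the rest of the line is the URI,
    # so it may itself contain spaces.
    version, method, uri = line.rstrip("\r\n").split(" ", 2)
    return version, method, uri


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Parse the real 'HTTP-Version SP Status-Code SP Reason-Phrase' layout."""
    # Same trick: the Reason-Phrase is whatever follows the second space.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason
```

With this layout, `parse_proposed_request_line("HTTP/1.1 GET /user/with space\r\n")` would keep `/user/with space` as a single URI, just as the Status-Line keeps `NOT FOUND` as a single Reason-Phrase.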



Solution

  • It may come from HTTP/0.9, the early protocol version.

    The request part was:

    GET http://www.example.com/foo.html\r\n
    

    And the response was just the response body (without headers), so directly your HTML document, starting with <html> for example.

    The Request Line is:

    METHOD OSP Absolute-Request-URL CRLF
    

    The important point is that there is no protocol version and no protocol part, in either the request or the response.

    When HTTP/1.0 was created, there was an implicit need to keep supporting HTTP/0.9 requests and responses, something that some servers still do today.

    On the response side, all the response headers were added (for example, to state the MIME type of the response), and the first line was built with this nice idea of starting with the protocol version of the response.

    On the request side, the protocol version was added as an optional suffix, so you could still choose between making an HTTP/0.9 request and one in the newer version. Most importantly, an HTTP/0.9 server might still understand your query and ignore the SP PROTOCOL addition (and even the optional headers added to the request). Today, if you omit the protocol part of your request, HTTP/0.9-compatible servers will parse only the first line of your request and ignore any extra headers.

    These are equivalent queries (but the first one is HTTP/0.9 and would get no headers in the response):

    # HTTP 0.9:
    GET http://www.example.com/foo.html\r\n
    # HTTP/1.0 version:
    GET http://www.example.com/foo.html HTTP/1.0\r\n
    \r\n
    # or
    GET /foo.html HTTP/1.0\r\n
    Host: www.example.com\r\n
    \r\n
    # or
    GET http://www.example.com/foo.html HTTP/1.0\r\n
    Host: www.foo.com\r\n
    \r\n
    

    I think they were considering the code changes needed in existing parsers, and adding the protocol at the end of the first line was easier to implement. Maybe an old parser could still send a 0.9 response to an HTTP/1.0 query (which is bad, but easy to write).

    Maybe just appending something to an existing line seemed more like an improvement than prefixing the line of the existing protocol.

    Maybe you should have been old enough to comment on the RFC at that time and tell them that your way would be more elegant (which is right) :-)
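
As a rough illustration of the backward-compatibility argument, here is a small Python sketch of a request-line parser that treats the protocol version as an optional trailing token. The function name and the choice to report a missing version as "HTTP/0.9" are assumptions made for the example, not something taken from an RFC or from any real server:

```python
from typing import Tuple


def parse_request_line(line: str) -> Tuple[str, str, str]:
    """Return (method, target, version) for a request line.

    A line with no version token is treated as an HTTP/0.9-style request.
    """
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) == 2:
        # Old style: METHOD SP target, nothing else.
        method, target = parts
        return method, target, "HTTP/0.9"
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        # Newer style: the version was appended at the end of the same line.
        method, target, version = parts
        return method, target, version
    raise ValueError(f"malformed request line: {line!r}")
```

Under these assumptions, `GET http://www.example.com/foo.html` and `GET /foo.html HTTP/1.0` both go through the same code path, and only the last token differs. It also shows the price paid: a target containing a raw space (as in `GET /user/with space HTTP/1.1`) splits into four tokens and is rejected.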