Tags: http, proxy, transparentproxy

Should a transparent HTTP proxy remove hop-by-hop HTTP headers?


I read that an HTTP proxy should remove hop-by-hop HTTP headers (https://www.freesoft.org/CIE/RFC/2068/143.htm).

It makes sense since some of these headers are connection-related.

The question is: does this RFC apply to explicit proxies only, or should transparent HTTP proxies do it too?

To give an example: let's say a client makes an HTTP call and has an explicit proxy configured. However, there is a transparent proxy in the middle, so the overall pipeline looks like this:

Client ↔ Transparent Proxy ↔ Explicit proxy ↔ Web page

The explicit proxy may require authentication, in which case it will send back a Proxy-Authenticate header.

If the transparent proxy removes this header (per the RFC), the client won't be prompted to authenticate and nothing will work.

This case jumped out immediately, but I think other scenarios could be envisioned in which a transparent proxy should NOT touch hop-by-hop headers.

Am I missing something, or do the hop-by-hop removal rules apply to explicit proxies only?


Solution

  • Transparent proxies don’t exist.

    As far as the HTTP RFC is concerned, there is simply no such thing. The specification does not recognise the concept. A client (A) may connect to a server (C) to fetch or modify a resource, or it may connect to a proxy (B) to have the latter do so on its behalf. In the former case, the hop-by-hop headers regulate the connection between the client and the server; in the latter, they regulate the connection between the client and the proxy. If the proxy connects to the server to serve the request, it has to manage its own hop-by-hop headers for the proxy–server link.

    Anything else you add beyond that is simply not a party to the protocol, and its presence should not influence how the protocol operates. Whether (A)’s connection to either (B) or (C) (or (B)’s connection to (C)) is mediated by something else is immaterial. All that matters is that when (A) chooses to send a request to (B), it should receive the same resource that it would if it chose to make a request to (C) directly. (B) and (C) don’t even have to be single hosts; they may themselves pass requests through any number of intermediary layers.

    For all it matters, the ‘transparent proxy’ may as well be a SOCKS proxy, in which case it will not modify any HTTP headers at all, because it cannot even be sure whether what it forwards is HTTP in the first place.
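To make the distinction concrete, here is a minimal Python sketch, not a real proxy implementation. It assumes the hop-by-hop header list from RFC 2616, section 13.5.1 (the successor to the RFC 2068 section linked in the question), plus any headers named in Connection. The function names `strip_hop_by_hop` and `relay_unchanged` are hypothetical, chosen for illustration only: the first models what a proxy (B) that is a party to the protocol does before forwarding, the second models an intermediary that is not a party to it and simply passes everything through.

```python
# Hop-by-hop header names, assumed here from RFC 2616, section 13.5.1.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
})


def strip_hop_by_hop(headers):
    """Model of a proxy (B) that is a party to the HTTP exchange.

    `headers` is a list of (name, value) pairs. Returns the pairs that
    would be forwarded to the next hop.
    """
    # Any header named in a Connection header is hop-by-hop as well.
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            for token in value.split(","):
                token = token.strip().lower()
                if token:
                    listed.add(token)
    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]


def relay_unchanged(headers):
    """Model of an intermediary that is not a party to the protocol
    (for example a byte-level relay): it forwards everything as-is."""
    return list(headers)


# Hypothetical response from the explicit proxy demanding authentication.
response_headers = [
    ("Proxy-Authenticate", 'Basic realm="proxy"'),
    ("Connection", "close, X-Foo"),
    ("X-Foo", "1"),
    ("Content-Type", "text/html"),
]

seen_via_relay = relay_unchanged(response_headers)
seen_via_proxy = strip_hop_by_hop(response_headers)
```

In this sketch the client behind the pass-through relay still sees Proxy-Authenticate and can respond to it, which is the behaviour the question needs from the intermediary in front of the explicit proxy.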