Can an HTTP proxy ignore the HTTP headers and just blindly pass them on? I am looking at writing a very simple forward proxy and would like to minimize complexity and attack surface.
Just adding some more detail to @Evert's response.
A proxy should try to re-emit clean HTTP. Usually you do not alter the body, but the same is not true for headers.
You'll have to parse the headers to detect the special ones that affect the size of the message body (like Content-Length or Transfer-Encoding), so you will have a parsed version of the headers. Do not paste the raw version of the received headers; instead, rewrite them from the parsed version. So a transformation like this should be applied:
# incoming
<header name>:<SPACE><SPACE><SPACE>HeaderValue\n
<header name>:<TAB>HeaderValue2\n
# out
<header name>:<SPACE>HeaderValue, HeaderValue2\r\n
The same goes for the protocol version (at the end of the first line). The proxy must enforce that value itself (HTTP/1.1 or HTTP/1.0); do not copy the value from the request.
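As a rough illustration of "parse, then rewrite", here is a minimal Python sketch. The function names `normalize_headers` and `serialize_request` are made up for this example, and it assumes the header lines have already been split out of the request and exclude the final blank line. It rejects anything that is not a plain `name: value` line, merges repeated names with a comma, and writes the request line with the proxy's own protocol version. Note that merging by comma is not valid for every header (Set-Cookie in responses is the well-known exception), so treat this as a starting point only.

```python
import re

# Header names must be HTTP "token" characters; no whitespace before the colon.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

def normalize_headers(raw_lines):
    """Parse raw header lines and return cleaned (name, value) pairs."""
    merged = {}  # lowercased name -> (first-seen name, list of values)
    for raw in raw_lines:
        line = raw.rstrip("\r\n")
        if line[:1] in (" ", "\t"):
            # Obsolete line folding: refuse instead of guessing.
            raise ValueError("folded header line rejected: %r" % raw)
        name, sep, value = line.partition(":")
        if not sep or not TOKEN.fullmatch(name):
            raise ValueError("malformed header line: %r" % raw)
        entry = merged.setdefault(name.lower(), (name, []))
        entry[1].append(value.strip(" \t"))
    return [(name, ", ".join(values)) for name, values in merged.values()]

def serialize_request(method, target, headers, version="HTTP/1.1"):
    """Write the request head from parsed data, with the proxy's own version."""
    lines = ["%s %s %s" % (method, target, version)]
    lines += ["%s: %s" % (name, value) for name, value in headers]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")

if __name__ == "__main__":
    hdrs = normalize_headers(["X-Test:   HeaderValue\n", "X-Test:\tHeaderValue2\n"])
    print(serialize_request("GET", "http://example.com/", hdrs))
```

The point of the design is that nothing from the raw input reaches the output untouched: whitespace, line endings and the version string are all produced by the proxy.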
Then, after RFC 7230 that @Evert listed, where you will find a lot of MUST/MUST NOT requirements for proxies, you should also have a look at:
The more you read about HTTP, the harder it seems to write a simple proxy.
Most rules in the RFC are there to ensure the proxy speaks clean HTTP, cleaning up bad or strange HTTP syntax. That matters against HTTP smuggling attacks, which are a big problem for a proxy. If your proxy has a slightly different way of interpreting almost-bad HTTP syntax than other agents, the two may end up disagreeing about the size of a message.
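To make the "size interpretation" problem concrete, here is a small Python sketch of a deliberately strict framing check for requests. The function name `body_framing` and the `(name, value)` list representation are assumptions for this example. It refuses any request whose body length could be read two ways, which is stricter than the RFC requires but simpler to reason about.

```python
def body_framing(headers):
    """Decide how a request body is delimited, or raise if it is ambiguous.

    headers is a list of (name, value) pairs.
    Returns ("chunked", None), ("length", n) or ("none", 0).
    """
    lengths = [v.strip() for n, v in headers if n.lower() == "content-length"]
    encodings = [v.strip().lower() for n, v in headers
                 if n.lower() == "transfer-encoding"]

    if lengths and encodings:
        # Classic smuggling setup: two agents may each trust a different header.
        raise ValueError("both Content-Length and Transfer-Encoding present")
    if encodings:
        if encodings != ["chunked"]:
            raise ValueError("unsupported Transfer-Encoding: %r" % encodings)
        return ("chunked", None)
    if lengths:
        value = lengths[0]
        if len(set(lengths)) != 1 or not (value.isascii() and value.isdigit()):
            raise ValueError("invalid or conflicting Content-Length: %r" % lengths)
        return ("length", int(value))
    # A request with neither header has no body.
    return ("none", 0)

if __name__ == "__main__":
    print(body_framing([("Content-Length", "5")]))
```

Rejecting with a 400 response in every `ValueError` case means the proxy never forwards a message that a downstream server could frame differently.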
That is dangerous only if you implement HTTP pipelining on top of HTTP keep-alive. So if you want a very simple forward proxy, and are not sure about HTTP syntax cleanup, ensure no pipelines are supported on your proxy (which is allowed, a server can always treat only the first request of a pipeline), for example by sending Connection: close headers.
That's not the way to get a very fast proxy, but that's why it's hard to write a good, fast, robust proxy.
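A minimal Python sketch of the "no keep-alive, no pipelining" approach follows. The helper name `force_close` and the `(name, value)` list representation are assumptions for this example. It only replaces the connection-management headers; a complete proxy also has to drop any header named inside the incoming Connection value, which is left out here.

```python
def force_close(headers):
    """Return a copy of headers that asks the peer to close after this message.

    headers is a list of (name, value) pairs. Any incoming Connection or
    Keep-Alive header is dropped and a single "Connection: close" is added.
    """
    kept = [(name, value) for name, value in headers
            if name.lower() not in ("connection", "keep-alive")]
    kept.append(("Connection", "close"))
    return kept

if __name__ == "__main__":
    incoming = [("Host", "example.com"),
                ("Connection", "keep-alive"),
                ("Keep-Alive", "timeout=5")]
    print(force_close(incoming))
```

Applied to both the forwarded request and the response sent back to the client, and combined with actually closing the sockets after one exchange, this keeps any leftover bytes from ever being parsed as a second request.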